Abstract

We introduce shower deconstruction, a method to look for new physics in a hadronic environment. 
The method aims to be a full information approach using small jets. It assigns to each event a 
number $\chi$ that is an estimate of the ratio of the probability for a signal process to produce that
event to the probability for a background process to produce that event. The analytic functions 
we derive to calculate these probabilities mimic what full event generators like Pythia or Herwig 
do and can be depicted in a diagrammatic way. As an example, we apply this method to a boosted 
Higgs boson produced in association with a Z-boson and show that this method can be useful to 
discriminate this signal from the Z+jets background.
I. INTRODUCTION



A central problem for data analysis at the Large Hadron Collider (LHC) is to find the 
signal for the production of a new heavy particle or particles against a background of jets 
produced by standard model processes that do not involve the sought heavy particle. Exam- 
ples include searches for supersymmetric partners of the quarks and gluons and searches for 
the Higgs boson. While such searches focus on leptonic final states, most of the sought new 
physics resonances have a large branching ratio to hadrons. Thus, it is of great importance 
to be able to disentangle hadronically decaying particles with masses around the electroweak 
scale from large QCD backgrounds. 

The decay products of a new very heavy particle will appear in the detector as one or 
more jets. There may also be jets from initial state radiation. The jets will contain subjets. 
In this paper, we call the subjets microjets. They are defined with a standard jet algorithm 
but with a small effective cone size R. The pattern of microjets in events arising from the 
new particle decay will differ from the pattern of microjets in background events that do 
not involve new particles. One can take advantage of this difference to separate signal from 
background. 

In this paper, we propose a method for separating signal from background by analyzing 
the distribution of the microjets. This method has the potential to be effective in quite 
general circumstances. However, for a first application, we choose a process in which we 
are looking at the microjets contained in a larger jet that results from the decay of a heavy 
particle with large transverse momentum, that is a highly boosted heavy particle. 

There are several methods already available for the analysis of the structure of the mi- 
crojets produced by the decay of a highly boosted heavy particle. Two of these methods, 
trimming [1] and pruning [2, 3], can be characterized as generic in that they have the
potential to discover new physics signals even if one does not have in mind a particular new
physics scenario. Other methods, including the one proposed here, are adapted to searches 
for particular new physics signals. These include mass drop with filtering and b-quark
tagging [4], the matrix element method [5–7], and the template overlap method [8, 9]. These
last two methods bear some resemblance to the method proposed in this paper. One can
also combine methods [10]. For further applications see Refs. [11–31] and for a review see
Ref. [32].

The example that we consider in this paper is the production of a Higgs boson in
association with a high transverse momentum Z-boson, where the Z-boson decays into $e^+ + e^-$ or
$\mu^+ + \mu^-$ and the Higgs boson decays into $b + \bar{b}$. This example was analyzed in Ref. [4]. Since
the Higgs boson recoils against a high transverse momentum Z-boson, the Higgs boson has 
a large transverse momentum and is easier to find than if it had low transverse momentum. 
Nevertheless, there is a large background to this process from standard model processes that 
do not involve the Higgs boson, so some ingenuity is required to separate the signal from 
the background. 

The idea of this paper is to define an observable $\chi$ that is a function of the observed
configuration of the final state microjets in an event and distinguishes between a sought
signal and the background. To do that, we define $\chi$ as the ratio of the probability that
the microjet configuration observed would arise in a signal event to the probability that it
would arise in a background event. We use a parton shower algorithm for this purpose.
However, our parton shower algorithm is massively simplified compared to Pythia [33] or
Herwig [34] in order that we can compute the probability for a given microjet configuration
analytically. We call the method proposed here shower deconstruction.

II. OVERVIEW AND EVENT SELECTION 

As stated in the introduction, the idea of this paper is to define an observable $\chi$ that
is a function of the configuration of the final state in an event and distinguishes between
a sought signal and the background. The method that we propose is quite general, but in
order to explain it with reasonable clarity, we need to consider a specific process. Our choice
of process is guided by the desire to have a case that is relatively simple to explain. The
example that we use is the search for the Higgs boson using the process $p + p \to H + Z + X$
where the Z-boson decays to $\mu^+ + \mu^-$ (or $e^+ + e^-$) while the Higgs boson $H$ decays to $b + \bar{b}$.
We try to separate this from the background process $p + p \to \mathrm{jets} + Z + X$ [4].

A. Event selection 

We simulate an analysis of data by using events generated by Pythia [33]. In order
to make the Higgs boson easier to find, we demand that the Z-boson against which it
recoils has a large transverse momentum. Specifically, we select events consistent with a
leptonically decaying Z-boson for which the leptons are central ($|y_l| < 2.5$) and fairly hard
($p_{T,l} > 15$ GeV). The invariant mass of the leptons is required to match the Z-boson mass,

\[
  |m_{l^+ l^-} - m_Z| < 10\ \mathrm{GeV} . \tag{1}
\]

The reconstructed Z-boson is required to be highly boosted in the transverse plane,

\[
  p_{T,l^+ l^-} > p_{T,\min} = 200\ \mathrm{GeV} . \tag{2}
\]

We next combine final state hadrons in simulated detector cells of size $0.1 \times 0.1$ and adjust
the absolute value of the momentum in each cell so that the four-momentum is massless. We
remove cells with energy less than 0.5 GeV. We then use these cells as input to the anti-$k_T$
jet-finding algorithm [35] with a large effective cone size, $R_F = 1.2$. For the recombination
of the jet constituents we use Fastjet [36]. We find the jet with the highest transverse
momentum of all such jets in the event and require its transverse momentum to be larger
than $p_{T,\min}$. This is the "fat jet."

Those selection cuts force the Higgs boson to recoil against the Z-boson with a large 
transverse momentum, so that the decay products of the Higgs boson are fairly well colli- 
mated. 
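The lepton cuts of Eqs. (1) and (2) can be sketched in a few lines of Python. This is a minimal illustration, not the analysis code used for the paper: the four-vector layout `(E, px, py, pz)`, the helper names, and the nominal value `M_Z = 91.19` GeV are assumptions made here for the example.

```python
import math

M_Z = 91.19  # GeV; nominal Z mass, assumed here for illustration


def pt(p):
    """Transverse momentum of a four-vector (E, px, py, pz)."""
    return math.hypot(p[1], p[2])


def rapidity(p):
    """Rapidity y = 0.5 * ln((E + pz) / (E - pz))."""
    return 0.5 * math.log((p[0] + p[3]) / (p[0] - p[3]))


def inv_mass(p, q):
    """Invariant mass of the sum of two four-vectors."""
    e, px, py, pz = (a + b for a, b in zip(p, q))
    return math.sqrt(max(e * e - px * px - py * py - pz * pz, 0.0))


def passes_z_selection(lep1, lep2, pt_min=200.0):
    """Apply the lepton cuts, Eq. (1), and Eq. (2) to a lepton pair."""
    for lep in (lep1, lep2):
        # Leptons must be central and fairly hard.
        if abs(rapidity(lep)) >= 2.5 or pt(lep) <= 15.0:
            return False
    # Eq. (1): dilepton mass within 10 GeV of the Z mass.
    if abs(inv_mass(lep1, lep2) - M_Z) >= 10.0:
        return False
    # Eq. (2): the reconstructed Z must be highly boosted.
    pair = tuple(a + b for a, b in zip(lep1, lep2))
    return pt(pair) > pt_min
```

The same transverse momentum threshold would then be applied to the leading fat jet, which requires a jet finder and is not sketched here.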

We denote the cross section for signal events that pass these cuts by $\sigma_{\mathrm{MC}}(S)$ and denote
the cross section for background events that pass these cuts by $\sigma_{\mathrm{MC}}(B)$. With some help
from next-to-leading order calculations, we estimate (footnote 1)

\[
  \begin{aligned}
  \sigma_{\mathrm{MC}}(S) &= 1.57\ \mathrm{fb} , \\
  \sigma_{\mathrm{MC}}(B) &= 2613\ \mathrm{fb} , \\
  \frac{\sigma_{\mathrm{MC}}(S)}{\sigma_{\mathrm{MC}}(B)} &= \frac{1}{1664} .
  \end{aligned} \tag{3}
\]

Our analysis makes use of events generated by a Monte Carlo event generator that we use
and regard as an accurate representation of nature. We renormalize the event generator
cross sections by constant factors for signal and background calculations so as to match the
cross sections given in Eq. (3). We will generally use "MC" subscripts to denote quantities
calculated by a Monte Carlo event generator supplemented by some next-to-leading order
information. As noted above, we use Pythia in our calculations; in Sec. XI, we also present
results using Herwig.



B. Variables describing the final state 

In principle, the final state could be described by the momenta and flavors of all final state
particles. However, we simplify this. First, we select events and use the anti-$k_T$ algorithm
to define the "fat jet" that recoils against the Z-boson, as described above.

We use the $k_T$ jet-finding algorithm [38] to group the fat jet into subjets, which we call
microjets. We choose the effective cone size in the $k_T$ jet-finding algorithm to be $R = 0.15$.
This size is chosen to correspond roughly to the angular resolution of calorimeter topological
clusters in the ATLAS experiment and to be a little larger than the ATLAS calorimeter
angular resolution of about 0.1 [39]. We do not want any of the microjets to be exactly
massless, so we add 0.1 GeV to the energy of each microjet.

Typically, the number of microjets found is between six and ten, but a few events have
even more microjets. The computational time needed to analyze an event increases quite
quickly with the number of microjets. Accordingly, we choose a number $N_{\max}$ with default
value $N_{\max} = 7$ and discard the lowest transverse momentum microjets if there are more than
$N_{\max}$ microjets, keeping the $N_{\max}$ microjets that have the highest transverse momenta. In
fact, we find that the lowest transverse momentum microjets carry little useful information:
we have varied $N_{\max}$ between 5 and 9 and find that the statistical significance of the results
that we obtain, as discussed in Sec. XI, increases only slowly with $N_{\max}$.
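The two preparation steps described above (the small energy shift and the truncation to the $N_{\max}$ hardest microjets) might be sketched as follows. The function name and the `(E, px, py, pz)` tuple layout are hypothetical; only the default values 0.1 GeV and $N_{\max} = 7$ come from the text.

```python
def prepare_microjets(microjets, n_max=7, energy_shift=0.1):
    """Shift energies so no microjet is exactly massless, then keep the
    n_max microjets with the highest transverse momentum.

    microjets: list of (E, px, py, pz) tuples in GeV.
    """
    # Add a small amount of energy (0.1 GeV by default) to each microjet.
    shifted = [(e + energy_shift, px, py, pz) for (e, px, py, pz) in microjets]
    # Order by transverse momentum squared, hardest first.
    shifted.sort(key=lambda p: p[1] ** 2 + p[2] ** 2, reverse=True)
    # Discard the softest microjets beyond n_max.
    return shifted[:n_max]
```

Because the cost of the later sum over shower histories grows quickly with the number of microjets, this truncation is what keeps the per-event computation manageable.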

The microjets found by this procedure are described, in part, by their momenta $\{p\}_N =
\{p_1, \ldots, p_N\}$, with $p_i^2 > 0$.

For some microjets $j$, we also provide a b-quark tag, $t_j$. To qualify for a tag, the microjet
must be among the three microjets in the event with the highest $p_T$ values and it must have
$p_T > p_T^{\mathrm{tag}}$, where our default value is $p_T^{\mathrm{tag}} = 15$ GeV. For microjets $j$ that do not qualify for
a tag we set $t_j = \mathrm{none}$. In the simplest implementation, one would take $t_j = T$ if microjet $j$
contains a $b$ or $\bar{b}$ quark and otherwise define $t_j = F$. We simulate b-tagging of microjets in
experiment by using more realistic b-tagging for Pythia events:



Footnote 1: We generate events for $Z + \mathrm{jet} \to l^+ l^- + \mathrm{jet}$ and $HZ \to b\bar{b}$ using Pythia in a configuration with
large transverse momentum and normalize the cross section to the one obtained from MCFM [37] with
the same cuts. Then we calculate the cross section after selection cuts based on the number of events that
pass the selection cuts.
• If any hadron in microjet $j$ contains a $b$ or $\bar{b}$ quark, then we set $t_j = T$ with a
probability $P(T|b)$ and $t_j = F$ with a probability $1 - P(T|b)$.

• If no hadron in microjet $j$ contains a $b$ or $\bar{b}$ quark, then we set $t_j = T$
with a probability $P(T|{\sim}b)$ and $t_j = F$ with a probability $1 - P(T|{\sim}b)$.

Our default value for the b-tagging efficiency is $P(T|b) = 0.6$ while our default value for the
mistag probability is $P(T|{\sim}b) = 0.02$ [40].
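The tagging procedure above can be illustrated with a short Python sketch. The data layout (a list of dicts with keys `pt` and `has_b`) and the function name are assumptions made for this example; the default efficiency, mistag probability, threshold, and the restriction to the three hardest microjets follow the text.

```python
import random


def assign_b_tags(microjets, rng, pt_tag=15.0, eff=0.6, mistag=0.02,
                  n_candidates=3):
    """Return one tag per microjet: 'T', 'F', or None (no tag provided).

    microjets: list of dicts with keys 'pt' (GeV) and 'has_b' (bool).
    rng: a random.Random instance, passed in so results are reproducible.
    """
    # Only the n_candidates hardest microjets can qualify for a tag.
    order = sorted(range(len(microjets)),
                   key=lambda j: microjets[j]["pt"], reverse=True)
    candidates = set(order[:n_candidates])
    tags = []
    for j, mj in enumerate(microjets):
        if j not in candidates or mj["pt"] <= pt_tag:
            tags.append(None)  # t_j = none
            continue
        # P(T|b) for microjets containing a b hadron, P(T|~b) otherwise.
        p_tag = eff if mj["has_b"] else mistag
        tags.append("T" if rng.random() < p_tag else "F")
    return tags
```

A call such as `assign_b_tags(microjets, random.Random(1))` draws the tags with the default probabilities.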

This procedure of defining microjets within the fat jet gives a somewhat "coarse grained"
description of the part of the event that is of interest: the momenta and b-quark tags,
$\{p,t\}_N = \{p_1, t_1; \ldots; p_N, t_N\}$, of the microjets.



C. Probabilities according to Monte Carlo event generator 

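We denote by $P_{\mathrm{MC}}(\{p,t\}_N|S)$ the probability that a signal event has a microjet
configuration $\{p,t\}_N$, as determined by the Monte Carlo event generator that we use and regard
as an accurate representation of nature (footnote 2):

\[
  P_{\mathrm{MC}}(\{p,t\}_N|S) = \frac{1}{\sigma_{\mathrm{MC}}(S)}\,
  \frac{d\sigma_{\mathrm{MC}}(S)}{d\{p,t\}_N} . \tag{4}
\]

Similarly, we let the probability that a background event has a microjet configuration $\{p,t\}_N$ be

\[
  P_{\mathrm{MC}}(\{p,t\}_N|B) = \frac{1}{\sigma_{\mathrm{MC}}(B)}\,
  \frac{d\sigma_{\mathrm{MC}}(B)}{d\{p,t\}_N} . \tag{5}
\]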

We now seek an observable that does a good job of distinguishing signal events from
background events. Our sought observable is to be a function $\chi(\{p,t\}_N)$ of the microjet
configuration. It will also be a function of the parameters of the standard model, especially
the mass $m_H$ of the Higgs boson.

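As a preliminary step, we define a quantity $\chi_{\mathrm{MC}}(\{p,t\}_N)$ by

\[
  \chi_{\mathrm{MC}}(\{p,t\}_N) = \frac{P_{\mathrm{MC}}(\{p,t\}_N|S)}{P_{\mathrm{MC}}(\{p,t\}_N|B)} . \tag{6}
\]

We would like to use $\chi_{\mathrm{MC}}(\{p,t\}_N)$ as our observable. In fact, if one considers that the Monte
Carlo event generator is accurate and if one could construct $\chi_{\mathrm{MC}}$ as a function of $\{p,t\}_N$,
then this could be considered to be the ideal observable.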

Why might one consider $\chi_{\mathrm{MC}}$ to be an ideal observable? To see this in the simplest
context, let us suppose that we want to examine data using a cut: we accept events if
$C(\{p,t\}_N) > 0$, where $C(\{p,t\}_N)$ is some function that we are at liberty to make up. The
signal and background cross sections with this cut are

\[
  \begin{aligned}
  \sigma_C(S) &= \int d\{p,t\}_N\; \Theta\big(C(\{p,t\}_N)\big)\,
  \frac{d\sigma_{\mathrm{MC}}(S)}{d\{p,t\}_N} , \\
  \sigma_C(B) &= \int d\{p,t\}_N\; \Theta\big(C(\{p,t\}_N)\big)\,
  \frac{d\sigma_{\mathrm{MC}}(B)}{d\{p,t\}_N} .
  \end{aligned} \tag{7}
\]

Footnote 2: Here the differential $dp_j$ for each microjet $j$ can just mean $d^4 p_j$.
Choose a value $\sigma_C(S)$ that we want for the signal cross section and require that the cut
produce this value of signal cross section. With this constraint on the signal cross section,
we will have the best statistical significance for a measurement if we make $\sigma_C(B)$ as small as
possible. Thus we seek to choose the cut so as to minimize $\sigma_C(B)$ with $\sigma_C(S)$ held constant.
The solution to this problem is to choose $C(\{p,t\}_N)$ such that the surface $C(\{p,t\}_N) = 0$ is a
surface of constant $\chi_{\mathrm{MC}}(\{p,t\}_N)$. That is, we should measure the cross section inside a cut
defined by

\[
  C(\{p,t\}_N) = \chi_{\mathrm{MC}}(\{p,t\}_N) - \chi_0 \tag{8}
\]

for some $\chi_0$. If we make any small adjustment to this by removing an infinitesimal region
with $\chi_{\mathrm{MC}}(\{p,t\}_N) > \chi_0$ from the cut and adding a region having the same signal cross
section but with $\chi_{\mathrm{MC}}(\{p,t\}_N) < \chi_0$, we raise the total background cross section within the
cut while keeping the signal cross section the same. Thus using contours of $\chi_{\mathrm{MC}}(\{p,t\}_N)$ to
define our cut is the best that we can do.
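The optimality argument above can be checked on a toy example. The sketch below replaces the continuous configuration space by a handful of discrete cells with made-up signal and background probabilities, and compares a cut on the ratio with a brute-force search over all possible accepted regions. The function names and the numbers are purely illustrative assumptions.

```python
from itertools import combinations


def likelihood_ratio_region(p_sig, p_bkg, n_cells):
    """Accept the n_cells cells with the largest ratio p_sig / p_bkg."""
    order = sorted(range(len(p_sig)),
                   key=lambda i: p_sig[i] / p_bkg[i], reverse=True)
    return set(order[:n_cells])


def best_region(p_sig, p_bkg, target_sig):
    """Brute force: among all sets of cells with signal probability at
    least target_sig, return (background, cells) with minimal background."""
    best = None
    n = len(p_sig)
    for r in range(n + 1):
        for cells in combinations(range(n), r):
            s = sum(p_sig[i] for i in cells)
            b = sum(p_bkg[i] for i in cells)
            if s >= target_sig - 1e-12 and (best is None or b < best[0] - 1e-15):
                best = (b, set(cells))
    return best


# Toy distributions over four cells (illustrative numbers only).
p_sig = [0.40, 0.30, 0.20, 0.10]
p_bkg = [0.05, 0.15, 0.30, 0.50]

region = likelihood_ratio_region(p_sig, p_bkg, 2)
target = sum(p_sig[i] for i in region)
background, cells = best_region(p_sig, p_bkg, target)
```

In this toy case the brute-force optimum coincides with the region selected by the ratio, as the argument in the text predicts whenever the required signal cross section corresponds to a threshold in the ratio.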

What value of $\chi_0$ should one choose? For a simple optimized cut based analysis with a
given amount of integrated luminosity, one would choose $\chi_0$ so as to maximize the ratio of the
expected number of signal events to the square root of the expected number of background
events. We discuss this further in Sec. XI.

Instead of using an optimized cut on $\chi_{\mathrm{MC}}$ to separate signal from background, one could
imagine using a log likelihood ratio constructed from $\chi_{\mathrm{MC}}$. We do not discuss that method
in this paper.

Now we must face the fact that to construct $\chi_{\mathrm{MC}}(\{p,t\}_N)$, we would need two things:
the differential cross section to find microjets $\{p,t\}_N$ in background events and then the
differential cross section to find microjets $\{p,t\}_N$ in signal events. In each case, we would
consider this differential cross section in a parton shower approximation to the full theory.
Unfortunately for us, a parton shower produces $d\sigma_{\mathrm{MC}}(S)/d\{p,t\}_N$ and $d\sigma_{\mathrm{MC}}(B)/d\{p,t\}_N$
by producing Monte Carlo events at random according to these distributions. If we have
7 microjets described by 4 momentum variables each and we divide each of these 28
variables into 10 bins, then we have approximately $10^{28}/7! \sim 10^{24}$ total bins (accounting for
the interchange symmetry among the 7 microjets). The parton shower Monte Carlo event
generator will fill these bins with events, but it will be a long time before we have of order
100 counts per bin in order to estimate $d\sigma_{\mathrm{MC}}(S)/d\{p,t\}_N$ and $d\sigma_{\mathrm{MC}}(B)/d\{p,t\}_N$ at each bin
center. Thus it is not practical to calculate $\chi_{\mathrm{MC}}(\{p,t\}_N)$ numerically by generating Monte
Carlo events. It is also not practical to calculate $\chi_{\mathrm{MC}}(\{p,t\}_N)$ analytically using the shower
algorithms in Pythia or Herwig. These programs are very complicated, so that we have
no hope of finding $P_{\mathrm{MC}}(\{p,t\}_N|S)$ and $P_{\mathrm{MC}}(\{p,t\}_N|B)$ for either of them.
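The bin-counting estimate is simple arithmetic and can be reproduced directly. The helper name below is hypothetical; the inputs (7 microjets, 4 variables each, 10 bins per variable, about 100 counts per bin) are the ones quoted in the text.

```python
import math


def count_bins(n_microjets=7, vars_per_microjet=4, bins_per_var=10):
    """Number of distinct bins after dividing out the interchange
    symmetry among the microjets (integer division is enough here)."""
    total = bins_per_var ** (n_microjets * vars_per_microjet)
    return total // math.factorial(n_microjets)


n_bins = count_bins()            # about 2 x 10^24
events_needed = 100 * n_bins     # of order 100 counts per bin
```

The resulting number of events is far beyond what any event generator could produce, which is the point of the paragraph above.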



D. Probabilities according to simplified shower 



What we need is an observable $\chi(\{p,t\}_N)$ that is an approximation to $\chi_{\mathrm{MC}}(\{p,t\}_N)$ such
that we can calculate $\chi(\{p,t\}_N)$ analytically for any given $\{p,t\}_N$. For this purpose, we
define a simple, approximate shower algorithm, which we will call the simplified shower
algorithm. We let $P(\{p,t\}_N|S)$ and $P(\{p,t\}_N|B)$ be the probabilities to produce the
microjet configuration $\{p,t\}_N$ in, respectively, signal and background events according to the
simplified shower algorithm. Define

\[
  \chi(\{p,t\}_N) = \frac{P(\{p,t\}_N|S)}{P(\{p,t\}_N|B)} . \tag{9}
\]
FIG. 1: $d\sigma_{\mathrm{MC}}(B)/d\log\chi$ for background events (upper curve) and $d\sigma_{\mathrm{MC}}(S)/d\log\chi$ for signal
events (lower curve) for samples of signal and background events generated by Pythia. We use
the cuts described in Sec. II A.

This function, $\chi(\{p,t\}_N)$ without the "MC" subscript, is the observable that we use. We
may call the calculation of $\chi(\{p,t\}_N)$ shower deconstruction.

The parton state with N microjets is a possible intermediate state in a parton shower. 
We seek to determine the probability that this intermediate state with parameters $\{p,t\}_N$
is generated. We try to build enough into the simpler shower to provide a reasonable ap- 
proximation to QCD and the rest of the standard model. Furthermore, we can define the 
shower so that the deconstruction is as simple as we can make it, even if that means that 
the corresponding shower algorithm is not so practical as an event generator. For instance, 
an implementation of the simplified shower algorithm as an event generator might generate 
weighted events in a way that makes unweighting the events costly in computer time. Addi- 
tionally, probability conservation might be only approximate, so that the generated weights 
for different outcomes do not sum exactly to one. No matter: we are not going to use the 
simplified shower algorithm to generate events anyway. Additionally, we can ignore any 
factors in $P(\{p,t\}_N|S)$ and $P(\{p,t\}_N|B)$ that are common between them for each $\{p,t\}_N$
since such factors cancel in $\chi$.

Our construction will be far from perfect, and it can be useful even if it is not perfect.
We will use Pythia to measure the cross section $d\sigma_{\mathrm{MC}}(S)/d\log\chi$ to have signal events with
a given value of $\chi$ and the corresponding cross section $d\sigma_{\mathrm{MC}}(B)/d\log\chi$ to have background
events with this value of $\chi$. In Fig. 1 we show these two functions for the simplified shower
as defined in the following sections. In this illustration, we see that increasing $\chi$ favors signal
compared to background.

There is another way to present the results in Fig. 1 that is more informative. Let us
define integrated signal and background cross sections above a cut:

\[
  s(\chi) = \int_{\chi}^{\infty} d\chi'\, \frac{d\sigma_{\mathrm{MC}}(S)}{d\chi'} , \qquad
  b(\chi) = \int_{\chi}^{\infty} d\chi'\, \frac{d\sigma_{\mathrm{MC}}(B)}{d\chi'} . \tag{10}
\]

FIG. 2: Plot of $s/b$ versus $s$, where $s$ and $b$ are defined in Eq. (10). We use samples of signal and
background events generated by Pythia as in Fig. 1.



It is useful to use $s$ in plots as the independent variable. With this definition, $s$ runs from
0 to $\sigma_{\mathrm{MC}}(S)$ and $s = 0$ corresponds to $\chi = \infty$. We can then examine the ratio of signal to
background cross sections, $s/b$, considered as a function of $s$.
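Given weighted Monte Carlo events, the integrals in Eq. (10) become sums of event weights above the cut. The following sketch shows one way to tabulate $s/b$ against $s$; the event representation as `(chi, weight)` pairs with weights in fb and the function names are assumptions made here, not the authors' code.

```python
def cross_section_above(events, chi_cut):
    """Sum of weights (in fb) of events with chi above chi_cut.

    events: list of (chi, weight) pairs.
    """
    return sum(w for chi, w in events if chi > chi_cut)


def s_over_b_curve(signal, background, cuts):
    """Return a list of (s, s/b) points, one per cut value in `cuts`.

    Cuts for which no background survives are skipped, since s/b is
    then undefined for a finite sample.
    """
    curve = []
    for cut in cuts:
        s = cross_section_above(signal, cut)
        b = cross_section_above(background, cut)
        if b > 0.0:
            curve.append((s, s / b))
    return curve
```

Scanning the cut from small to large values traces the curve from $s = \sigma_{\mathrm{MC}}(S)$ down toward $s = 0$.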

In Fig. 2 we display the information in Fig. 1 as a plot of $s/b$ versus $s$. We have used here
the $\chi(\{p,t\}_N)$ from our simplified shower algorithm. If we could somehow use $\chi_{\mathrm{MC}}(\{p,t\}_N)$,
based on the same Monte Carlo event generator that we used to generate events, then we
would obtain a curve for $s/b$ versus $s$ that is everywhere higher. No algorithm could produce
a curve above this limiting curve, but we have no way of determining the limiting curve.

We see in Fig. 2 that $s/b$ is small for large $s$ but that there is a region of $s$ in which $s/b$
is not too small. This is what one hopes to accomplish with shower deconstruction. We will
return in Sec. XI to a discussion of numerical results.

In the following sections, we describe how shower deconstruction works. Conceptually, 
it is very simple. However, there are quite a few ingredients. That is because we seek to 
approximate the probability that a parton shower will give a certain set of microjets and 
there are quite a few ingredients in a parton shower. The simplified parton shower that 
we describe in the following sections is modeled on the general parton shower algorithm 
described in Ref. [41] and, in particular, on its leading color, spin-averaged version [42].
It is basically a virtuality ordered shower, although we modify the evolution variable in
Refs. [41, 42] to be virtuality/energy instead of just virtuality. This shower is a partitioned
dipole shower, and we choose a dipole partitioning function from Ref. [43].
FIG. 3: A shower history for a background event. The "star" vertex represents the production
of a high $p_T$ parton from the hard interaction. The "diamond" vertices represent production of
partons by initial state radiation. Each parton can split into two daughter partons at a shower
vertex, represented by a small circle. In this background event, one of the gluons splits into a light
$q$-$\bar{q}$ pair.

A shower algorithm in which one can calculate the probability to produce a given parton 
configuration has been proposed in Ref. [44]. The aims of this algorithm are rather different
from ours in that the algorithm of Ref. [44] is designed to be practical as an event generator.
Accordingly, the methods used are rather different from ours. 

III. ORGANIZATION OF SHOWER DECONSTRUCTION 

In this section, we explain the overall organization of shower deconstruction, beginning 
with the concept of a shower history. 

A. Shower histories 

In general, a shower history h is a tree Feynman diagram showing how N final state 
partons (the microjets) could have evolved starting with a hard scattering process for signal 
or background events. In our application, we simplify quite a lot. First, we look not at the 
whole event, but only at the microjets that make up the fat jet. For background events, 
we assume that the microjets came from a parton shower induced by a high $p_T$ parton plus
parton showers starting from initial state radiation (including radiation from the underlying
event), as illustrated in Fig. 3. For signal events, we assume that the microjets came from
the decay products of a Higgs boson (through $H \to b + \bar{b}$) plus parton showers starting from
initial state radiation, as illustrated in Fig. 4.

Each parton in the shower history carries a flavor label $f_i$. We make some simplifications
in the flavor structure of the simplified shower.

1. For shower histories corresponding to signal events, we have a Higgs boson intermediate
state. That is, we have a parton with flavor $f_i = H$.
FIG. 4: A shower history for a signal event. The dashed line is the Higgs boson, produced in the
hard interaction. It decays into a b-quark and a $\bar{b}$-quark, which carry arrows representing the flow
of b-flavor. The QCD shower splitting of a b-quark is to a b-quark plus a gluon. In this event, one
of the gluons splits into further gluons.

2. The Higgs boson decays into a b-quark and a $\bar{b}$-quark, so we need flavors $f_i = b$ and
$f_i = \bar{b}$.

3. A b- or $\bar{b}$-quark can emit a gluon, so we have partons in our shower histories with
flavor $f_i = g$.

4. A gluon can split to a b-quark and a $\bar{b}$-quark.

5. A gluon can also split to a light quark and a light antiquark, so we have partons in
our shower with flavors $f_i = q$ and $f_i = \bar{q}$. We do not distinguish whether the light
quark pairs are $(u, \bar{u})$, $(d, \bar{d})$, $(s, \bar{s})$, or $(c, \bar{c})$. Instead, we simply multiply the emission
probability for one flavor of light quark by $n_f - 1 = 4$, where $n_f = 5$ is the number of
quark flavors including the b quark.

6. As an approximation, we treat the initial hard parton in a background event as being 
a gluon. Similarly, we treat partons radiated from the incoming initial state partons 
as being gluons. 

A shower history in which a gluon splits into a $b$-$\bar{b}$ pair is illustrated in Fig. 5.

The probabilities $P(\{p,t\}_N|B)$ and $P(\{p,t\}_N|S)$ in our shower model will consist of a sum
of partial probabilities corresponding to different shower histories. In the following sections,
we assume that we have picked a shower history $h$ and we seek to construct the probability
$P(\{p,t\}_N|B,h)$ or $P(\{p,t\}_N|S,h)$ corresponding to that shower history.
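Since each probability is a sum of partial probabilities over shower histories, the observable can be assembled from per-history values once those are available. The sketch below assumes the partial probabilities have already been computed elsewhere and are passed in as plain lists of floats; the function name is hypothetical.

```python
import math


def chi_from_histories(signal_history_probs, background_history_probs):
    """Ratio of summed partial probabilities over shower histories.

    signal_history_probs:     values of P({p,t}_N | S, h), one per history h
    background_history_probs: values of P({p,t}_N | B, h), one per history h
    """
    p_signal = math.fsum(signal_history_probs)
    p_background = math.fsum(background_history_probs)
    if p_background <= 0.0:
        raise ValueError("background probability must be positive")
    return p_signal / p_background
```

Factors common to all signal and background histories can be dropped from the inputs, because they cancel in the ratio.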

We will return in Sec. X to the question of how to construct the shower histories in a
reasonably efficient fashion. First, though, we need to define the factors corresponding to 
the vertices and propagators in our shower history diagrams. We begin with a description 
of the color flow. 
FIG. 5: A shower history for a background event in which a high $p_T$ gluon splits to a $b + \bar{b}$ pair.
The QCD shower splitting of a b-quark is to a b-quark plus a gluon. The $b$ and $\bar{b}$ quarks radiate
gluons and one of the gluons splits into two gluons.

B. Color connections 

We work in the standard leading color approximation and will need to keep track of color 
connections. 

Consider a final state splitting in which a gluon labeled $J$ splits into two daughter gluons.
Let the label of the daughter that carries the $\bar{3}$ color of the mother parton $J$ be $A$. We draw
this daughter parton on the left in our diagrams. Let the label of the daughter parton that
carries the $3$ color of parton $J$ be $B$. We draw this daughter parton on the right in our
diagrams. We track the angle variables of two color connected partner partons to parton $J$.
Parton $k(J)_L$ carries the $3$ color that is connected to the $\bar{3}$ color line of parton $J$. Parton
$k(J)_R$ carries the $\bar{3}$ color that is connected to the $3$ color line of parton $J$. The labels
$k(J)_L$ and $k(J)_R$ specify lines in the shower history diagram, not necessarily final microjets.
Given the labels of the color connected partners to the mother parton $J$, we assign the color
connected partners of the daughter partons. The two daughter partons are color connected
partners of each other and each inherits one of the color connected partners of the mother.
That is

\[
  k(A)_L = k(J)_L , \qquad k(A)_R = B , \tag{11}
\]

and

\[
  k(B)_L = A , \qquad k(B)_R = k(J)_R . \tag{12}
\]

If parton $J$ is a quark, then it has a color connected partner $k(J)_R$ that carries the $\bar{3}$
color connected to the quark's $3$ color. There is no $k(J)_L$ partner. The quark can split into
daughter quark $A$ and a daughter gluon $B$, which we draw on the right because it carries
the $3$ color of the mother quark. The color connected partners of the daughter partons are
then

\[
  k(A)_R = B , \tag{13}
\]

and

\[
  k(B)_L = A , \qquad k(B)_R = k(J)_R . \tag{14}
\]
Similarly, if parton $J$ is an antiquark, then it has a color connected partner $k(J)_L$ that
carries the $3$ color connected to the antiquark's $\bar{3}$ color. There is no $k(J)_R$ partner. The
antiquark can split into daughter antiquark $B$ and a daughter gluon $A$, which we draw on
the left because it carries the $\bar{3}$ color of the mother antiquark. The color connected partners
of the daughter partons are then

\[
  k(A)_L = k(J)_L , \qquad k(A)_R = B , \tag{15}
\]

and

\[
  k(B)_L = A . \tag{16}
\]

Consider a final state splitting in which a gluon with label $J$ splits into $q + \bar{q}$ (or $b + \bar{b}$).
Let the label of the daughter antiquark be $A$; we draw it to the left because it carries the $\bar{3}$
color of the mother parton $J$. Let the label of the daughter quark be $B$; we draw it to the
right because it carries the $3$ color of the mother parton. The color connected partners of
the daughter partons are

\[
  k(A)_L = k(J)_L , \qquad k(B)_R = k(J)_R . \tag{17}
\]

Finally, consider the decay of a Higgs boson, labelled $J$, into $b + \bar{b}$. Since the Higgs boson
is a color singlet, the $b$ and $\bar{b}$ quarks are each other's color connected partners. We draw the
b-quark on the left and call its label $A$, while we draw the $\bar{b}$-quark on the right and call its
label $B$. The color connected partners of the daughter partons are

\[
  k(A)_R = B , \qquad k(B)_L = A . \tag{18}
\]

These procedures define color connections recursively. To start the recursion the initial
hard parton in a background event has undefined color connected partners: $k(J)_L = k(J)_R =$
undefined. If we knew the complete Feynman diagram representing a shower history, then
all color connected partners would be defined, but we know about only partons that are part
of the fat jet, so we have an incomplete shower history. The true color partners of the initial
hard parton could be partons that are not in the fat jet, or they could be partons from initial
state radiation. Because we do not know the true color connections, we leave them undefined.
Similarly, partons created as initial state radiation have undefined color connections in our
approximation. As the shower progresses, the undefined color connections are inherited,
but most partons later in the shower have defined color connections (footnote 3).
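The recursive bookkeeping of Eqs. (11) through (18) can be written as a small lookup. This is a sketch under assumptions: the string keys naming the splitting types and the use of `None` for both "undefined" and "no partner on that side" are choices made here for illustration, not notation from the paper.

```python
def daughter_partners(splitting, k_left, k_right, a, b):
    """Color connected partners of daughters a (drawn left) and b (drawn right).

    k_left, k_right: partners k(J)_L and k(J)_R of the mother J
                     (None if undefined or absent).
    Returns {label: (left_partner, right_partner)}.
    """
    rules = {
        # gluon -> gluon + gluon, Eqs. (11) and (12)
        "g->gg": ((k_left, b), (a, k_right)),
        # quark -> quark A + gluon B, Eqs. (13) and (14)
        "q->qg": ((None, b), (a, k_right)),
        # antiquark -> gluon A + antiquark B, Eqs. (15) and (16)
        "qbar->gqbar": ((k_left, b), (a, None)),
        # gluon -> antiquark A + quark B, Eq. (17)
        "g->qbarq": ((k_left, None), (None, k_right)),
        # Higgs -> b (A) + bbar (B), Eq. (18)
        "H->bbbar": ((None, b), (a, None)),
    }
    if splitting not in rules:
        raise ValueError(f"unknown splitting type: {splitting}")
    partners_a, partners_b = rules[splitting]
    return {a: partners_a, b: partners_b}
```

Starting from a hard gluon with both partners undefined and applying the function at each vertex of a shower history reproduces the inheritance pattern described above, including the undefined connections that propagate down the outer edges of the tree.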

C. Kinematics 

We need to describe the kinematics of a splitting of a parton J into two partons, call 
them A and B. There is a big advantage to making the simplest choice for the relation 
among the corresponding momenta: 

\[
  p_J = p_A + p_B . \tag{19}
\]



Footnote 3: As we will see, partons with undefined color connections are allowed to radiate soft partons into an
unrestricted angular region. Since all of our partons are contained in the angular region of the fat jet, 
this does not cause much of a problem. However, if we wanted to increase the angular region considered 
in shower deconstruction, we would need to specify color connected partners for all partons. 
FIG. 6: Probability to create the initial parton in the hard interaction. The left hand vertex is for
the background process, the right hand vertex is for the signal process.

This means that $p_J^2 > 0$ even if $p_A^2 = 0$ and $p_B^2 = 0$. In shower generation (as distinguished
from shower deconstruction) one does not do this. One wants $p^2 = 0$ for all intermediate
partons since one does not know the virtualities of daughter partons at the time that the
splitting is generated. When all partons have $p^2 = 0$, one has to take some momentum from
somewhere in order to balance momentum. If we did that for shower deconstruction, the
required treatment would be difficult. For shower deconstruction, we simply use Eq. (19)
and allow all partons to have $p^2 > 0$. Then each parton (or jet) is characterized by four
variables, one of which is $\mu^2 = p^2$.

With this choice, each parton is described by four variables: its virtuality $\mu^2$, its rapidity $y$,
its azimuthal angle $\phi$, and the absolute value $k$ of its transverse momentum. The $(+, -, 1, 2)$
components of the momentum of the parton are then (footnote 4)

\[
  p = \left( \tfrac{1}{\sqrt{2}} \sqrt{k^2 + \mu^2}\, e^{y} ,\;
             \tfrac{1}{\sqrt{2}} \sqrt{k^2 + \mu^2}\, e^{-y} ,\;
             k \cos\phi ,\; k \sin\phi \right) . \tag{20}
\]

We are now ready to turn to the vertices of our shower history diagrams. 
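The kinematic conventions of this section can be checked numerically. The sketch below builds light-cone components from $(\mu^2, y, \phi, k)$ assuming the convention $p^\pm = (p^0 \pm p^3)/\sqrt{2}$, for which $p^2 = 2p^+p^- - k^2$, and verifies that adding two massless daughters as in Eq. (19) gives a mother with positive virtuality. Function names are hypothetical.

```python
import math


def lightcone_momentum(mu2, y, phi, k):
    """Return (p+, p-, p1, p2) for virtuality mu2, rapidity y,
    azimuthal angle phi, and transverse momentum magnitude k."""
    m_t = math.sqrt(k * k + mu2)  # transverse mass
    return (m_t * math.exp(y) / math.sqrt(2.0),
            m_t * math.exp(-y) / math.sqrt(2.0),
            k * math.cos(phi),
            k * math.sin(phi))


def virtuality(p):
    """p^2 = 2 p+ p- - (p1)^2 - (p2)^2 in this convention."""
    return 2.0 * p[0] * p[1] - p[2] ** 2 - p[3] ** 2


def add(p, q):
    """Momentum conservation at a splitting, Eq. (19)."""
    return tuple(a + b for a, b in zip(p, q))


# Two massless daughters at slightly different angles (illustrative values).
p_a = lightcone_momentum(0.0, 0.0, 0.0, 30.0)
p_b = lightcone_momentum(0.0, 0.2, 0.3, 20.0)
p_j = add(p_a, p_b)
mother_virtuality = virtuality(p_j)  # strictly positive
```

The mother virtuality here equals $2 k_A k_B (\cosh\Delta y - \cos\Delta\phi)$, which vanishes only for exactly collinear massless daughters.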

IV. THE HARD INTERACTION VERTEX 

We first need a factor to represent the hard scattering process that creates the starting
high $p_T$ parton that forms the fat jet, or, more exactly, forms the part of the fat jet that is
not from initial state emissions. This factor is represented by the "star" vertex, as in Fig. 6.
We consider first the hard vertex for background events.

A. Background 

First, we impose a requirement that the scattering process that creates the starting high
$p_T$ parton is indeed the dominant hard scattering process in the event. We define $Q^2$ to be
the square of the transverse momentum of the fat jet plus the square of its mass,

\[
  Q^2 = \Bigg( \sum_{i \in \text{fat jet}} \vec{p}_{T,i} \Bigg)^{2}
      + \Bigg( \sum_{i \in \text{fat jet}} p_i \Bigg)^{2} . \tag{21}
\]

We then define $k_{T,I}$ to be the transverse momentum of all microjets that are part of the fat
jet but are not in the decay products of the initial hard parton. That is, $k_{T,I}$ is the transverse
momentum of all microjets associated with initial state and underlying event radiation. We
demand that

\[
  k_{T,I}^2 < Q^2/4 . \tag{22}
\]

Footnote 4: We use momentum components $p^\pm = (p^0 \pm p^3)/\sqrt{2}$.

For the probability density associated with the creation of the initial hard parton, we use 
a factor 

$$H_g = N_{\rm pdf}\left(\frac{p_{T,\min}^2}{k^2}\right)^{N_{\rm pdf}} \frac{1}{k^2}\;\Theta\!\left(k_{T,\mathrm{I}}^2 < Q^2/4\right). \tag{23}$$

Here $k$ is the transverse momentum of the initial hard parton. The factor $1/k^2$ is an approximation 
to the $k^2$ dependence of the square of the hard matrix element. The hard scattering 
cross section is also proportional to a product of parton distribution functions. We approximate 
the dependence on the parton distribution functions by including a factor $1/(k^2)^{N_{\rm pdf}}$, 
where our default value for the exponent is $N_{\rm pdf} = 2$. This value yields an approximation to 
the one jet inclusive cross section at the Large Hadron Collider, as illustrated in Fig. 11 of 
Ref. [45]. The parameter $p_{T,\min}$ is the smallest allowed transverse momentum of the Z-boson 
against which the initial hard parton recoils, $p_{T,\min} = 200$ GeV, Eq. (pi). The normalization 
factor $N_{\rm pdf}\,(p_{T,\min}^2)^{N_{\rm pdf}}$ is chosen so that the integral $\int dk^2\, H_g$ from $p_{T,\min}^2$ to infinity is 1. 
There is an additional normalization factor that we omit because it cancels between the 
hard scattering cross sections for background and for signal. 
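The normalization statement can be checked numerically. The sketch below assumes the form $H_g = N_{\rm pdf}\,(p_{T,\min}^2/k^2)^{N_{\rm pdf}}/k^2$ (with the theta function set to 1) and integrates it over $k^2$ from $p_{T,\min}^2$ to infinity using the substitution $u = p_{T,\min}^2/k^2$. The function names and the midpoint-rule integrator are illustrative choices, not part of the paper.

```python
def hard_factor_background(k2, pt_min=200.0, n_pdf=2.0):
    """H_g without the theta function: N_pdf (pTmin^2 / k^2)^N_pdf / k^2."""
    return n_pdf * (pt_min ** 2 / k2) ** n_pdf / k2

def integrate_over_k2(func, pt_min=200.0, steps=20000):
    """Midpoint-rule estimate of the integral of func(k^2) dk^2 from
    pTmin^2 to infinity, mapping the range onto 0 < u <= 1 with
    u = pTmin^2 / k^2."""
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) / steps
        k2 = pt_min ** 2 / u
        jacobian = pt_min ** 2 / u ** 2  # |d(k^2)/du|
        total += func(k2) * jacobian / steps
    return total
```

Calling `integrate_over_k2(hard_factor_background)` gives 1 up to rounding for the default exponent, and the same holds for other positive values of `n_pdf`, which is the property the normalization factor is meant to guarantee.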



B. Signal 

We also need a factor to represent the hard scattering process that creates the Higgs 
boson. For this purpose, we use a factor 

$$H_H = N_{\rm pdf}\left(\frac{p_{T,\min}^2 + m_H^2}{k_H^2 + m_H^2}\right)^{N_{\rm pdf}} \frac{1}{k_H^2 + m_H^2}\;\Theta\!\left(k_{T,\mathrm{I}}^2 < Q^2/4\right) \tag{24}$$

as in Eq. (23). Here $k_H$ is the transverse momentum of the Higgs boson, $m_H$ is the Higgs 
boson mass, $k_{T,\mathrm{I}}$ is the total transverse momentum of all partons emitted in the initial state, 
and $Q^2$ is defined in Eq. (21). The remaining factors provide an approximation to the 
dependence on the parton distribution functions. The default values of the parameters are 
$N_{\rm pdf} = 2$ and $p_{T,\min} = 200$ GeV as in Eq. (& 

V. INITIAL STATE AND UNDERLYING EVENT RADIATION 

We have seen how to model the hard interaction that creates either a high $p_T$ QCD parton 
or a Higgs boson. Now we need to model initial state and underlying event radiation, defining 
an emission probability $H_{\rm IS}$ as illustrated in Fig. 7. Consider the probability for the emission 
of a gluon with positive rapidity from an initial state parton that participates in the hard 
interaction. Since the gluon has positive rapidity, this emission is predominantly from the 
active parton "a" from hadron A. We use "b" as the label for the other active incoming 
quark, from hadron B. We take $p_a$ to be in the $+$ direction and $p_b$ to be in the $-$ direction. 
We suppose that the emitting parton "a" has a color connected partner with label $k$. For 
the processes that we examine, the initial state partons are likely to be quarks, so there 
is only one color connected partner. The emitted parton carries the label $J$. As a simple 
approximation, we assume that it is a gluon. 

FIG. 7: Probability to create a parton by initial state radiation, including both perturbative and 
nonperturbative radiation. 

We start with the dipole formula for the squared matrix element for the emission, 

$$H_{\rm dipole} \approx \frac{C_A}{2}\,(4\pi\alpha_s)\,\frac{2\,p_a\cdot p_k}{p_J\cdot p_a\;p_J\cdot p_k} . \tag{25}$$

Writing $p_J\cdot p_k$ in components, this is 

$$H_{\rm dipole} \approx \frac{C_A}{2}\,(4\pi\alpha_s)\,\frac{2\,p_a^+ p_k^-}{p_a^+ p_J^-\left(p_J^+ p_k^- + p_J^- p_k^+ - \vec k_{\perp,J}\cdot\vec k_{\perp,k}\right)} . \tag{26}$$

In order to simplify this, we assume that $p_J^+ p_k^- \gg p_J^- p_k^+$ and $p_J^+ p_k^- \gg |\vec k_{\perp,J}\cdot\vec k_{\perp,k}|$. With this 
approximation, 

$$H \approx \frac{C_A}{2}\,(4\pi\alpha_s)\,\frac{2}{p_J^+ p_J^-} . \tag{27}$$

This is exactly 

$$H \approx \frac{8\pi\alpha_s C_A}{k_J^2 + \mu_J^2} . \tag{28}$$

This emission probability applies for emitted gluons with positive rapidity, emitted from the 
active parton in hadron A. It also applies for emitted gluons with negative rapidity, emitted 
from the active parton in hadron B. To cover all gluons emitted in the central region, we 
simply use Eq. (28) for both positive and negative rapidity. (We note that $H$ is independent 
of rapidity with the approximations that we have used.) 

In Eq. (28), we choose the squared transverse momentum $k_J^2$ as the argument of $\alpha_s$ and 
we neglect $\mu_J^2$ compared to $k_J^2$: 

$$H \approx \frac{8\pi\alpha_s(k_J^2)\,C_A}{k_J^2} . \tag{29}$$

This expression should then be a fairly good approximation for the emission probability as 
long as $k_J$ is large enough for the emission to be purely perturbative and small enough for 
the parton momentum fraction carried away by the emitted gluon to be negligible. If the 
parton momentum fraction carried away by the emitted gluon is not negligible, there should 
be an additional factor 

$$R = \frac{(1-z)\,f\big(x/(1-z),\,k_J^2\big)}{f(x,\,k_J^2)} , \tag{30}$$

where $x$ is the momentum fraction of the parton after emitting the gluon, $zx/(1-z)$ is 
the momentum fraction of the emitted gluon, $x/(1-z)$ is the momentum fraction of the 
parton before emitting the gluon and the functions $f$ are parton distribution functions. 
(See Eq. (8.26) of Ref. [47].) When $k_J^2 \ll Q^2$ we have $z \ll 1$ and $R \approx 1$. However, the 
approximation $R \approx 1$ breaks down for values of $k_J^2/Q^2$ at which initial state radiation is still 
significant. We do not want our simplified shower model to depend on parton distribution 
functions, so we make a rather crude approximation, 

$$R = \frac{1}{(1 + c_R\,k_J/Q)^{n_R}} , \tag{31}$$

where our default values for the parameters are $c_R = 2$ and $n_R = 1$. 

FIG. 8: The distribution of initial state jets as a function of their transverse momentum $k_J$ as 
produced in Pythia compared to the distribution produced by $H_{\rm IS}$ and its perturbative and 
nonperturbative parts. The distributions are integrated over all azimuthal angles and over the rapidity 
range $-2 < y < 2$. For our model, we use $H_{\rm IS}$ from Eq. (32), calling the first term $H_{\rm IS}^{\rm pert}$ and the 
second term $H_{\rm IS}^{\rm np}$. The distribution from $H_{\rm IS}$ is shown as a heavy line, while the steeper line below 
is from $H_{\rm IS}^{\rm np}$ and the shallower line below is from $H_{\rm IS}^{\rm pert}$. 
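To make the size of this correction concrete, here is a minimal sketch of the crude factor $R = 1/(1 + c_R k_J/Q)^{n_R}$ with the default values quoted above. The function name is illustrative; $Q$ is the hard scale from Eq. (21) and is simply passed in as a number.

```python
def pdf_suppression(k_j, q, c_r=2.0, n_r=1.0):
    """Crude stand-in for the parton-distribution ratio R of Eq. (30):
    R = 1 / (1 + c_R * k_J / Q)^n_R."""
    return 1.0 / (1.0 + c_r * k_j / q) ** n_r
```

With the defaults, an emission with $k_J = Q/2$ is suppressed by a factor of 1/2 relative to the soft limit, while $R \to 1$ as $k_J/Q \to 0$, which reproduces the behaviour required of the exact ratio when $k_J^2 \ll Q^2$.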

With this factor $R$ included, we should have a fairly good approximation for the emission 
probability as long as $k_J$ is large enough for the emission to be purely perturbative. To 
give ourselves some flexibility at small $k_J$, we replace $k_J^2$ by $k_J^2 + \kappa_{\rm p}^2$ in the argument of $\alpha_s$ 
and the factor $1/k_J^2$. Our default value for the parameter here is $\kappa_{\rm p}^2 = 4\ {\rm GeV}^2$. Then the 
perturbative $H$ is frozen when $k_J$ gets to be much smaller than $\kappa_{\rm p}$. We then add back a 
simple non-perturbative function that gives us a chance to adjust the amount of radiation 
for smaller values of $k_J$. 

This gives the complete initial state emission probability 

$$H_{\rm IS} = \frac{8\pi C_A\,\alpha_s(k_J^2 + \kappa_{\rm p}^2)}{k_J^2 + \kappa_{\rm p}^2}\;\frac{1}{(1 + c_R\,k_J/Q)^{n_R}} + \frac{16\pi\,c_{\rm np}\,(\kappa_{\rm np}^2)^{n_{\rm np}-1}}{\left[k_J^2 + \kappa_{\rm np}^2\right]^{n_{\rm np}}} . \tag{32}$$

Our default values for the non-perturbative parameters are $c_{\rm np} = 1$, $\kappa_{\rm np}^2 = 4\ {\rm GeV}^2$, and 
$n_{\rm np} = 3/2$. It is intended that, with adjustment of parameters, we can include perturbative 
radiation from the active initial state partons together with radiation at central rapidities 
and small transverse momenta that is associated with the underlying event and with event 
pileup. 

Our choice for the parameters is based on comparisons with results from Pythia, including 
the representation in Pythia of the effects of the underlying event. We used Pythia to 
produce events for $p + p \to H + Z + X$ where both the Higgs boson and the Z-boson decay 
to muons. For this process, all hadrons are produced by initial state radiation. Although 
we did not impose a $p_T$ cut on the Z-boson, the hard scattering scale here is similar to 
that for our signal and background processes. We looked for jets that were produced by the 
initial state radiation, selecting jets using the $k_T$ algorithm with $R = 0.2$ and counting all 
jets with rapidities in the range $-2 < y < 2$. The resulting distribution as a function of the 
jet transverse momentum $k_J$ is shown in Fig. 8. This distribution is to be compared with 

$$\frac{dN_{\rm IS}}{dk_J} = \int\!\frac{d^4p}{(2\pi)^4}\;2\pi\,\delta(p^2)\;\delta\big(|\vec p_T| - k_J\big)\;\Theta(|y_p| < 2)\;H_{\rm IS} . \tag{33}$$

This curve, with our choice of parameters, is shown in Fig. 8 along with two more curves 
corresponding to the two terms in $H_{\rm IS}$. The jets described by $H_{\rm IS}$ are primary jets that can split 
to produce the jets modeled by Pythia, so we have made the primary jet spectrum somewhat 
harder than the Pythia jet spectrum. In Sec. XI, we comment on whether the choice 
of these and other parameters affects the numerical results from shower deconstruction. 

VI. FINAL STATE QCD SHOWER SPLITTINGS 

In this section, we define the main part of the simplified shower, QCD shower splittings. 

A. Splitting probability for $g \to g + g$ 

The splitting vertex for a QCD splitting $g \to g + g$ is represented by a function $H_{ggg}$ as 
illustrated in Fig. 9. We call these the conditional splitting probabilities. Here the condition 
is that the mother parton has not split already at a higher virtuality. 

Let us examine what we should choose for $H_{ggg}$ for a $g \to g + g$ splitting. We take the 
mother parton to carry the label $J$ and we suppose that the daughter partons are labelled 
$A$ and $B$ as shown in the figure. The form of the splitting probability depends on which of 
the two daughter partons is the softer. We let $h$ be the label of the harder daughter parton 
and $s$ be the label of the softer daughter parton: $k_s < k_h$. 

By definition, $k_s < k_h$. We first look at the splitting in the limit $k_s \ll k_h$. The splitting 
probability is then dominated by graphs in which parton $s$ is emitted from a dipole consisting 
of parton $J$ and some other parton, call it parton $k$. If $s = A$, then the emitting dipole is 
formed from parton $h = B$ and parton $k = k(J)_{\rm L}$, while if $s = B$, then the emitting dipole 
is formed from parton $h = A$ and parton $k = k(J)_{\rm R}$. The choice of $k$ depends on which of 
the two daughter partons is parton $s$, so where needed we will use the notation $k(s)$ instead 
of simply $k$. 

For H, we start with the dipole approximation for the squared matrix element (with 

$$H_{\rm dipole} \approx \frac{C_A}{2}\,(4\pi\alpha_s)\,\frac{2\,p_h\cdot p_k}{p_s\cdot p_h\;p_s\cdot p_k} . \tag{34}$$

FIG. 9: Splitting function for final state $g \to g + g$ splittings. 

We use 

$$\begin{aligned}
2\,p_s\cdot p_h &= 2 k_s k_h\left[\cosh(y_s - y_h) - \cos(\phi_s - \phi_h)\right] \\
&\approx k_s k_h\left[(y_s - y_h)^2 + (\phi_s - \phi_h)^2\right] \\
&= k_s k_h\,\theta_{sh}^2 , \\
2\,p_s\cdot p_k &\approx k_s k_k\,\theta_{sk}^2 , \\
2\,p_h\cdot p_k &\approx k_h k_k\,\theta_{hk}^2 ,
\end{aligned} \tag{35}$$

where 

$$\begin{aligned}
\theta_{sk}^2 &= (y_s - y_k)^2 + (\phi_s - \phi_k)^2 , \\
\theta_{hk}^2 &= (y_h - y_k)^2 + (\phi_h - \phi_k)^2 .
\end{aligned} \tag{36}$$

Thus 

$$H_{\rm dipole} \approx \frac{8\pi\alpha_s C_A}{k_s^2}\,\frac{\theta_{hk}^2}{\theta_{sh}^2\,\theta_{sk}^2} . \tag{37}$$

This function is singular when parton $s$ is soft, since it is proportional to $1/k_s^2$. It is singular 
when parton $s$ is parallel to parton $h$. It is also singular when parton $s$ is parallel to parton 
$k$. We can partition $H_{\rm dipole}$ into two parts, one, $H_{sh}$, associated with emission from parton 
$h$ and one, $H_{sk}$, associated with emission from parton $k$. (Here we treat parton $s$ as very 
soft and regard parton $h$ after the emission and parton $J$ before the emission as the same.) 
We write 

$$H_{sh} = H_{\rm dipole} \times A'_{hk} , \qquad H_{sk} = H_{\rm dipole} \times A'_{kh} , \tag{38}$$

where 

$$A'_{hk} = \frac{\theta_{sk}^2}{\theta_{sh}^2 + \theta_{sk}^2} , \qquad A'_{kh} = \frac{\theta_{sh}^2}{\theta_{sh}^2 + \theta_{sk}^2} , \tag{39}$$

so that 

$$A'_{hk} + A'_{kh} = 1 . \tag{40}$$

This dipole partitioning function is that of Ref. [43], Eq. (7.12), adapted to the small angle 
approximations used here. For a Catani-Seymour dipole shower, one uses a different dipole 
partitioning function. 

With this choice, we have 

$$H_{sh} = \frac{8\pi\alpha_s C_A}{k_s^2\,\theta_{sh}^2}\;\frac{\theta_{hk}^2}{\theta_{sh}^2 + \theta_{sk}^2} . \tag{41}$$

We can improve this a little so that it works better when parton $s$ is not extremely soft. We 
recall that, for parton $s$ soft, $\mu_J^2 \approx k_s k_h\,\theta_{sh}^2$ and that $k_h \approx k_J$ and the angles of parton $J$ are 
close to those of parton $h$. Thus we take 

$$H = \frac{8\pi\alpha_s C_A}{\mu_J^2}\;\frac{k_J^2}{k_s k_h}\;\frac{\theta_{hk}^2}{\theta_{sh}^2 + \theta_{sk}^2} . \tag{42}$$

The angular factor 

$$g(y_s, \phi_s) = \frac{\theta_{hk}^2}{\theta_{sh}^2 + \theta_{sk}^2} \tag{43}$$

is of some interest. We plot it in Fig. 10. It enhances radiation into the region between parton 
$h$ and parton $k$ and disfavors radiation at angles much greater than the angle between parton 
$h$ and parton $k$. The variable "pull" [45] is designed to separate signal and background events 
based on this factor. Here, the same effect appears as a natural part of a parton shower 
based on color dipoles. 
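The behaviour of the angular factor is easy to explore numerically. The sketch below evaluates $g = \theta_{hk}^2/(\theta_{sh}^2 + \theta_{sk}^2)$ for points given as $(y, \phi)$ pairs, using the small-angle distance of Eq. (36). The function names are illustrative, and the wrap-around of $\phi$ at $\pm\pi$ is ignored, which is an assumption that is only reasonable for partons inside a single fat jet.

```python
def theta2(a, b):
    """Squared small-angle separation between a = (y, phi) and b = (y, phi)."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2

def angular_factor(s, h, k):
    """g = theta_hk^2 / (theta_sh^2 + theta_sk^2) for soft parton s,
    harder daughter h and color connected partner k."""
    return theta2(h, k) / (theta2(s, h) + theta2(s, k))
```

Taking $h$ at the origin and $k$ at $(0.1, 0)$, as in Fig. 10, the factor equals 1 when $s$ sits on top of $h$, reaches 2 at the midpoint between $h$ and $k$, and falls to about 0.02 at $(0.5, 0)$, well outside the dipole.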

So far, we have an approximation that is good in the limit of emission of a soft gluon. 
This approximation is also good when the gluon labeled $s$ is collinear with the mother parton 
direction as long as $k_s \ll k_J$. When the two daughter partons are nearly collinear, we have 

$$\frac{k_h}{k_J} \approx z , \qquad \frac{k_s}{k_J} \approx 1 - z , \tag{44}$$

where $z$ is the momentum fraction carried by gluon $h$. Our splitting function is proportional 
to 

$$\frac{k_J^2}{k_s k_h} \approx \frac{1}{z(1-z)} . \tag{45}$$

This is right for $(1 - z) \ll 1$ but it has corrections when $1 - z$ is not small. The complete 
DGLAP splitting kernel for collinear splittings is 

$$P_{gg}(z) = 2 C_A\,\frac{\left[1 - z(1-z)\right]^2}{z(1-z)} . \tag{46}$$

Thus we should replace 

$$\frac{k_J^2}{k_s k_h} \;\to\; \frac{k_J^2}{k_s k_h}\left[1 - \frac{k_s k_h}{k_J^2}\right]^2 . \tag{47}$$

Thus we take 

$$H = \frac{8\pi\alpha_s C_A}{\mu_J^2}\;\frac{k_J^2}{k_s k_h}\left[1 - \frac{k_s k_h}{k_J^2}\right]^2 \frac{\theta_{hk}^2}{\theta_{sh}^2 + \theta_{sk}^2} . \tag{48}$$

FIG. 10: The angular enhancement factor $g(y_s, \phi_s)$ of Eq. (43). The coordinates are $(y_s - y_h, \phi_s - \phi_h)$. 
The color connected parton $k$ is at coordinates (0.1, 0). This figure is adapted from Ref. 
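The compact form of the $g \to g + g$ kernel used in Eq. (46) is the same function as the more familiar three-term expression $2C_A\left[z/(1-z) + (1-z)/z + z(1-z)\right]$. The short sketch below checks this numerically and also evaluates the correction factor $[1 - z(1-z)]^2$ that multiplies the soft approximation $1/(z(1-z))$. The value $C_A = 3$ is the standard QCD color factor; the function names are illustrative.

```python
C_A = 3.0  # QCD color factor for gluon emission from a gluon

def p_gg_compact(z):
    """Kernel in the compact form 2 C_A [1 - z(1-z)]^2 / (z(1-z))."""
    return 2.0 * C_A * (1.0 - z * (1.0 - z)) ** 2 / (z * (1.0 - z))

def p_gg_standard(z):
    """Kernel in the three-term form 2 C_A [z/(1-z) + (1-z)/z + z(1-z)]."""
    return 2.0 * C_A * (z / (1.0 - z) + (1.0 - z) / z + z * (1.0 - z))

def soft_limit_correction(z):
    """Ratio of the full kernel to the soft approximation 2 C_A / (z(1-z))."""
    return (1.0 - z * (1.0 - z)) ** 2
```

The correction factor tends to 1 when either daughter is soft and drops to 0.5625 for a symmetric splitting at $z = 1/2$, so the soft approximation overestimates the kernel by at most a factor of about 1.8.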

We need to add another ingredient: $\mu_J^2$ cannot be too large. Suppose that the mother 
of parton $J$ is parton $K$ and the sister is parton $J'$. We need to be able to neglect $\mu_J^2$ and 
$\mu_{J'}^2$ in the calculation of $(p_J + p_{J'})^2 = \mu_K^2$. With a little kinematic analysis, we see that 
neglecting $\mu_J^2$ and $\mu_{J'}^2$ is a good approximation if 

$$\frac{\mu_J^2}{k_J} \ll \frac{\mu_K^2}{k_K} , \qquad \frac{\mu_{J'}^2}{k_{J'}} \ll \frac{\mu_K^2}{k_K} . \tag{49}$$

We can enforce this condition in an approximate way by requiring 

$$2\,\frac{\mu_J^2}{k_J} < \frac{\mu_K^2}{k_K} , \qquad 2\,\frac{\mu_{J'}^2}{k_{J'}} < \frac{\mu_K^2}{k_K} . \tag{50}$$

For this reason, we include in $H$ a factor $\Theta(2\mu_J^2/k_J < \mu_K^2/k_K)$. We know $\mu_K^2$ from the 
shower history. If there is no mother parton because parton $J$ was produced in the hard 
interaction or by initial state bremsstrahlung, we take $\mu_K^2/k_K = 2k_J$, so that the virtuality 
ordering condition becomes simply $\mu_J^2 < k_J^2$. 

This same condition, iterated, restricts the daughter virtualities: 

$$2\,\frac{\mu_s^2}{k_s} < \frac{\mu_J^2}{k_J} . \tag{51}$$

FIG. 11: Splitting functions for final state QCD splittings of a quark or antiquark, including a $b$ 
or $\bar b$ quark. (Diagram labels: $H_{qqg}$ and $H_{qgq}$, each with mother momentum $p_J$ and daughter momenta $p_A$, $p_B$.) 



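This gives a splitting probability $H$: 

$$H_{ggg} = 8\pi C_A\,\frac{\alpha_s(\mu_J^2)}{\mu_J^2}\;\frac{k_J^2}{k_s k_h}\left[1 - \frac{k_s k_h}{k_J^2}\right]^2 \frac{\theta_{hk}^2}{\theta_{sh}^2 + \theta_{sk}^2}\;\Theta\!\left(2\,\frac{\mu_J^2}{k_J} < \frac{\mu_K^2}{k_K}\right) . \tag{52}$$

Here we evaluate $\alpha_s$ at the virtuality scale of the splitting. When there is no color connected 
parton visible, we are forced to simplify this to 

$$H_{ggg}^{\text{no-}k} = 8\pi C_A\,\frac{\alpha_s(\mu_J^2)}{\mu_J^2}\;\frac{k_J^2}{k_s k_h}\left[1 - \frac{k_s k_h}{k_J^2}\right]^2 \Theta\!\left(2\,\frac{\mu_J^2}{k_J} < \frac{\mu_K^2}{k_K}\right) . \tag{53}$$

Here there is no restriction on the angles $y_s, \phi_s$ of the emitted soft parton. This is potentially 
a very bad approximation, but in our case the approximation is tolerable because the emitted 
soft parton is necessarily within the fat jet. When, in addition, there is no mother parton 
$K$, this becomes 

$$H_{ggg}^{\text{no-}K} = 8\pi C_A\,\frac{\alpha_s(\mu_J^2)}{\mu_J^2}\;\frac{k_J^2}{k_s k_h}\left[1 - \frac{k_s k_h}{k_J^2}\right]^2 \Theta\!\left(\mu_J^2 < k_J^2\right) . \tag{54}$$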
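The three cases above differ only in which factors are present, so they can be folded into one function. The following is a minimal sketch under stated assumptions: the caller supplies $\alpha_s$ already evaluated at $\mu_J^2$, the angular information is passed as the three squared angles of Eq. (36) or omitted when no color connected partner is visible, and the mother parton is passed as the pair $(\mu_K^2, k_K)$ or omitted when $J$ has no mother. The function name and argument layout are hypothetical, not the authors' code.

```python
import math

C_A = 3.0  # QCD color factor

def h_ggg(alpha_s, mu2_j, k_s, k_h, k_j, angles=None, mother=None):
    """Conditional g -> g + g splitting probability.
    angles: (theta2_sh, theta2_sk, theta2_hk), or None if there is no
            visible color connected partner (angular factor dropped).
    mother: (mu2_K, k_K), or None if parton J has no mother, in which
            case the ordering condition is mu_J^2 < k_J^2."""
    if mother is None:
        allowed = mu2_j < k_j ** 2
    else:
        mu2_mother, k_mother = mother
        allowed = 2.0 * mu2_j / k_j < mu2_mother / k_mother
    if not allowed:
        return 0.0
    x = k_s * k_h / k_j ** 2
    h = 8.0 * math.pi * C_A * alpha_s / mu2_j * (1.0 - x) ** 2 / x
    if angles is not None:
        theta2_sh, theta2_sk, theta2_hk = angles
        h *= theta2_hk / (theta2_sh + theta2_sk)
    return h
```

For instance, `h_ggg(0.12, 100.0, 20.0, 80.0, 100.0)` evaluates the case with neither a partner nor a mother, and passing `angles=(0.0025, 0.0025, 0.01)` (soft parton halfway between $h$ and $k$) doubles the result through the angular factor.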



B. Splitting probability for $q \to q + g$ and $\bar q \to \bar q + g$ 

Quarks and antiquarks can radiate gluons. These splittings are represented by the splitting 
probabilities $H_{qqg}$ and $H_{qgq}$ that are illustrated in Fig. 11. We treat the splitting of 
a bottom quark as identical to the splitting of a light quark, neglecting the bottom quark 
mass. We take the splitting probability to be 

$$H_{qqg} = H_{qgq} = 8\pi C_F\,\frac{\alpha_s(\mu_J^2)}{\mu_J^2}\;\frac{k_J}{k_g}\left[1 + \left(\frac{k_q}{k_J}\right)^2\right] \frac{\theta_{qk}^2}{\theta_{gq}^2 + \theta_{gk}^2}\;\Theta\!\left(2\,\frac{\mu_J^2}{k_J} < \frac{\mu_K^2}{k_K}\right) . \tag{55}$$

The derivation follows the derivation that led to Eq. (52). Here $k_g$ is the transverse momentum 
of the gluon, $k_q$ is the transverse momentum of the quark or antiquark, and $k_J$ is the 
transverse momentum of the mother quark. Then using $k_q/k_J \approx z$ and $k_g/k_J \approx (1 - z)$, the 
factor containing these ratios gives the collinear splitting function 

$$P_{qq}(z) = C_F\,\frac{1 + z^2}{1 - z} \tag{56}$$

in the collinear limit. 

FIG. 12: Splitting function for final state QCD splittings that produce a $q\bar q$ pair. 

There is an angle factor in which $q$ labels the daughter quark or antiquark, $g$ labels the 
emitted gluon, and $k$ labels the color connected partner of the quark or antiquark. If there 
is no color connected partner in the fat jet, this angle factor is to be omitted. 

There is a theta function that restricts the squared mass $\mu_J^2$ of the daughter pair to be less than 
$\mu_K^2\,k_J/(2k_K)$, where $K$ labels the mother of parton $J$. With our approximations for shower 
histories, a quark or antiquark always has a mother parton. 
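The statement that the momentum-ratio factor reduces to the collinear kernel can be verified directly. The sketch below assumes the collinear relation $k_J = k_q + k_g$, so that $z = k_q/k_J$ and $1 - z = k_g/k_J$ hold exactly, and compares $C_F\,(k_J/k_g)\,[1 + (k_q/k_J)^2]$ with $C_F (1+z^2)/(1-z)$. The value $C_F = 4/3$ is the standard QCD color factor; the function names are illustrative.

```python
C_F = 4.0 / 3.0  # QCD color factor for gluon emission from a quark

def momentum_factor(k_q, k_g):
    """(k_J / k_g) [1 + (k_q / k_J)^2], taking k_J = k_q + k_g
    as is appropriate in the collinear limit."""
    k_j = k_q + k_g
    return (k_j / k_g) * (1.0 + (k_q / k_j) ** 2)

def p_qq(z):
    """Collinear q -> q + g splitting function C_F (1 + z^2) / (1 - z)."""
    return C_F * (1.0 + z * z) / (1.0 - z)
```

With $k_q = 70$ GeV and $k_g = 30$ GeV, both expressions give $C_F \times 1.49/0.3$, and the factor grows like $2/(1-z)$ as the gluon becomes soft.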



C. Splitting probability for $g \to q + \bar q$ 

We need one more QCD splitting probability, for $g \to q + \bar q$, including $g \to b + \bar b$ as 
illustrated in Fig. 12. Note that this splitting is important because $g \to b + \bar b$ is the main 
background for the $H \to b + \bar b$ signal, so we need to keep track of $g \to b + \bar b$ splittings even 
if they have a small probability. 

To construct the splitting function that we need, we can start with the $q \to q + g$ splitting 
function in Eq. (55). We can take the collinear limit, setting the angle factor to 1. Then we 
replace $P_{qq}$, Eq. (56), with $z \approx k_q/k_J$ and $(1 - z) \approx k_g/k_J$, by 

$$P_{qg} = T_R\left[z^2 + (1 - z)^2\right] \tag{57}$$

with $z \approx k_q/k_J$ and $(1 - z) \approx k_{\bar q}/k_J$. This gives 

$$H_{gq\bar q} = 8\pi T_R\,\frac{\alpha_s(\mu_J^2)}{\mu_J^2}\;\frac{k_q^2 + k_{\bar q}^2}{k_J^2}\;\Theta\!\left(2\,\frac{\mu_J^2}{k_J} < \frac{\mu_K^2}{k_K}\right) . \tag{58}$$

Note that this function is big for small $\mu_J^2$ in the limit in which the quark pair is collinear, 
but that there is no additional singularity when the quark or antiquark is soft. 

For a gluon splitting to $b + \bar b$ we use $H_{gb\bar b} = H_{gq\bar q}$ as given above. For a gluon splitting to 
$(u, \bar u)$, $(d, \bar d)$, $(s, \bar s)$, and $(c, \bar c)$, we include all four cases at once by using $(n_f - 1)H_{gq\bar q}$, where 
$(n_f - 1) = 4$. 

There is a theta function that restricts the squared mass $\mu_J^2$ of the daughter pair to be less than 
$\mu_K^2\,k_J/(2k_K)$, where $K$ labels the mother of parton $J$. If there is no mother parton $K$, this 
theta function becomes $\Theta(\mu_J^2 < k_J^2)$. 

VII. THE SUDAKOV FACTOR IN THE FINAL STATE SHOWER 



We have given definitions for splitting probabilities in the simplified shower. An important 
part of a parton shower event generator is the probability that a parton that was created 
at a virtuality scale $\mu_K^2$ has not split before it finally does split at a scale $\mu_J^2$. This is the 
Sudakov factor and has the form $\exp(-S)$, where $S$ is the integral of the splitting probability 
down to the scale $\mu_J^2$. In this section, we explore how to approximate $S$. 



A. Variables for parton splitting 



To evaluate the Sudakov exponent, we need to understand in some detail the integrations 
for combining two partons. 
We use 

$$\int\!\frac{d^4p}{(2\pi)^4}\;2\pi\,\delta(p^2 - \mu^2)\cdots = \frac{1}{4(2\pi)^3}\int\!dk^2\int\!dy\int\!d\phi\;\cdots . \tag{59}$$

We consider integrations over the momenta of partons $A$ and $B$ that we would like to combine 
to make parton $J$: 

$$\int\!\frac{d\mu_A^2}{2\pi}\,\frac{1}{4(2\pi)^3}\int\!dk_A^2\int\!dy_A\int\!d\phi_A \int\!\frac{d\mu_B^2}{2\pi}\,\frac{1}{4(2\pi)^3}\int\!dk_B^2\int\!dy_B\int\!d\phi_B\;\cdots . \tag{60}$$

Now we insert 

$$1 = \int\!\frac{d^4p_J}{(2\pi)^4}\,(2\pi)^4\,\delta^4(p_A + p_B - p_J) \tag{61}$$

and use Eq. (59) for $\int d^4p_J$. This gives 

$$\begin{aligned}
&\frac{1}{4(2\pi)^3}\int\!dk_J^2\int\!dy_J\int\!d\phi_J \int\!\frac{d\mu_J^2}{2\pi}\int\!\frac{d\mu_A^2}{2\pi}\int\!\frac{d\mu_B^2}{2\pi} \\
&\quad\times \frac{1}{(2\pi)^2}\,\frac{1}{16}\int\!dk_A^2\int\!dy_A\int\!d\phi_A\int\!dk_B^2\int\!dy_B\int\!d\phi_B \\
&\quad\times \delta^4(p_A + p_B - p_J)\;\cdots .
\end{aligned} \tag{62}$$

In the second line, we have six variables, $k_A$, $y_A$, $\phi_A$, $k_B$, $y_B$, and $\phi_B$, restricted by four delta 
functions. This leaves an integration over two variables. We choose one of the variables to 
be the momentum fraction 

$$z = \frac{k_A}{k_A + k_B} , \qquad 1 - z = \frac{k_B}{k_A + k_B} . \tag{63}$$

For the other integration variable describing the splitting, we use $\varphi$ defined by 

$$\tan\varphi = \frac{\sinh(\Delta y/2)\,\cos(\Delta\phi/2)}{\cosh(\Delta y/2)\,\sin(\Delta\phi/2)} , \tag{64}$$

where 

$$\Delta y = y_A - y_B , \qquad \Delta\phi = \phi_A - \phi_B . \tag{65}$$

Thus $\varphi$ is approximately the angle about the origin in the $(\Delta\phi, \Delta y)$ plane. 
Then Eq. (62) becomes 

$$\int\!\frac{d\mu_A^2}{2\pi}\int\!\frac{d\mu_B^2}{2\pi}\;\frac{1}{4(2\pi)^3}\int\!dk_J^2\int\!dy_J\int\!d\phi_J\;\frac{1}{4(2\pi)^2}\int\!\frac{d\mu_J^2}{2\pi}\int\!dz\int\!d\varphi\;\mathcal{J}\;\cdots , \tag{66}$$

where $\mathcal{J}$ is a jacobian to be discussed presently. We think about this as follows. We combine 
two subjets $A$ and $B$ of mass $\mu_A$ and $\mu_B$. We display integrations over $\mu_A$ and $\mu_B$, but these 
integrations remain unaltered between the original integral (60) and the result Eq. (66). In 
the original integral, we integrate over $k^2$, $y$ and $\phi$ for the two constituent jets, with the 
standard factor (footnote 5) $1/[4(2\pi)^3]$ for each. The subjets are combined to make a jet $J$ described 
by $k_J$, $y_J$ and $\phi_J$. We integrate over these variables with the standard factor $1/[4(2\pi)^3]$. 
This leaves variables $\mu_J^2$, $z$ and $\varphi$ that describe the splitting. Integration over these variables 
comes with a factor $1/[4(2\pi)^3]$ and a jacobian $\mathcal{J}$. 

In Eq. (66), a "strong ordering" approximation applies for jet masses, $\mu_A \ll \mu_J$ and 
$\mu_B \ll \mu_J$. In turn, $\mu_J$ is small compared to $k_A$, $k_B$ and $k_J$. For this reason, it is a sufficient 
approximation to set $\mu_A = \mu_B = 0$ in $\mathcal{J}$. In the appendix of this paper, we calculate $\mathcal{J}$ with 
$\mu_A = \mu_B = 0$. We find a quite simple result, 

$$\mathcal{J} = \frac{\sinh^2(\Delta y/2) + (1 + \mu_J^2/k_J^2)\,\sin^2(\Delta\phi/2)}{\sinh^2(\Delta y/2)\cosh^2(\Delta y/2) + (1 + \mu_J^2/k_J^2)\,\sin^2(\Delta\phi/2)\cos^2(\Delta\phi/2)} . \tag{67}$$

This result is even simpler when $\Delta y$ and $\Delta\phi$ are small. Since $\cosh(\Delta y/2) \approx 1$ and 
$\cos(\Delta\phi/2) \approx 1$ for small angles, we have 

$$\mathcal{J} \approx 1 \tag{68}$$

for small angles. 
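To see how quickly the jacobian approaches its small-angle value, the sketch below evaluates the ratio of Eq. (67) and the splitting angle of Eq. (64) for given $\Delta y$, $\Delta\phi$ and $\mu_J^2/k_J^2$. It assumes the jacobian has the form of a ratio of $\sinh^2 + (1+\mu_J^2/k_J^2)\sin^2$ to the same expression with extra $\cosh^2$ and $\cos^2$ factors; the function names are illustrative.

```python
import math

def splitting_jacobian(dy, dphi, mu2_over_k2):
    """Jacobian for the change of variables to (mu_J^2, z, varphi),
    evaluated with massless daughters. Undefined at dy = dphi = 0."""
    a = 1.0 + mu2_over_k2
    sh2 = math.sinh(dy / 2.0) ** 2
    ch2 = math.cosh(dy / 2.0) ** 2
    s2 = math.sin(dphi / 2.0) ** 2
    c2 = math.cos(dphi / 2.0) ** 2
    return (sh2 + a * s2) / (sh2 * ch2 + a * s2 * c2)

def splitting_angle(dy, dphi):
    """varphi from tan(varphi) = sinh(dy/2) cos(dphi/2) / (cosh(dy/2) sin(dphi/2))."""
    return math.atan2(math.sinh(dy / 2.0) * math.cos(dphi / 2.0),
                      math.cosh(dy / 2.0) * math.sin(dphi / 2.0))
```

For separations of order 0.1 the jacobian differs from 1 by far less than a percent, and even for $\Delta y = \Delta\phi = 1$ it stays within about 5% of 1, which supports using the small-angle form inside a fat jet.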



B. Splitting probability and the Sudakov exponent 

We will insert a splitting probability into each integration over the splitting variables, so 
that the splitting probability differential in the splitting variables $\mu_J^2, z, \varphi$ is 

$$dP = \frac{1}{4(2\pi)^3}\;d\mu_J^2\;dz\;d\varphi\;H\,e^{-S} . \tag{69}$$

Here we have approximated the jacobian $\mathcal{J}$ by its small angle form, $\mathcal{J} \approx 1$. We also use 
small angle approximations in $H$, as in our expressions in Sec. VI. For instance, we take 
$k_A/k_J \approx z$ and $k_B/k_J \approx (1 - z)$. 

Footnote 5: The Feynman rules that we use for calculating squared matrix elements assume that momentum 
integrations are $(2\pi)^{-4}\int d^4p\;(2\pi)\,\delta(p^2 - \mu^2)$, which gives this factor to accompany integrations over $k^2$, $y$, and 
$\phi$ as in Eq. (59). 

The corresponding total splitting probability is 

$$\int dP = \frac{1}{4(2\pi)^3}\int\!d\mu_J^2\int\!dz\int\!d\varphi\;H\,e^{-S} . \tag{70}$$

Here $H$ is the conditional splitting probability for a mother parton to split if it has not split 
at a higher virtuality than $\mu_J^2$ and $e^{-S}$ is the probability, derived from $H$, that the mother 
parton has not split at a higher virtuality. Given the physical meaning of the Sudakov factor, 
one would like 

$$S \approx \frac{1}{4(2\pi)^3}\int\!d\bar\mu_J^2\;\Theta(\mu_J^2 < \bar\mu_J^2)\int\!d\bar z\int\!d\bar\varphi\;H(\bar p_A, \bar p_B)\;\Theta\big(\{\bar p_A, \bar p_B\} \in \text{fat jet}\big) . \tag{71}$$

Here $\bar p_A$ and $\bar p_B$ denote the momenta of the daughter partons in a possible splitting and $\bar\mu_J$, 
$\bar z$, and $\bar\varphi$ denote parameters of the possible splitting. 

The theta function $\Theta(\{\bar p_A, \bar p_B\} \in \text{fat jet})$ is present for the following reason. Parton $J$ 
has, in each interval of virtuality $d\bar\mu_J^2$, a probability to emit a soft, wide angle gluon that is 
not seen because it is outside the boundary of the fat jet. The probability for emission of 
such a ghost gluon is most substantial when the color connected partner for the emission is 
itself outside the fat jet. Fortunately, the momentum of the emitted ghost gluon is small, 
since it must be a soft, wide angle gluon. Thus it is a sensible approximation to ignore this 
momentum loss. Since we cannot see the ghost emissions, we ignore them completely. This 
means that we ignore them in the Sudakov exponent $S$ by integrating only over splittings 
in which both daughter partons are in the fat jet. 

FIG. 13: Sudakov factor between final state splittings for a gluon. 



C. Sudakov exponent for gluon splitting 

As stated in the previous subsection, the Sudakov factor is the probability that the mother 
parton $J$ did not split at a virtuality above $\mu_J^2$. Thus the Sudakov factor is $\exp(-S)$, where 
$S$ is the probability for the mother parton to have split at a value of $\bar\mu_J^2$ that is greater 
than the value at which the splitting did, in fact, occur. The corresponding Sudakov factors 
are associated with the propagators in our shower history diagrams. For instance, for a 
gluon, the factor $\exp(-S_g)$ is indicated in Fig. 13. There are three contributions to $S_g$, 
corresponding to $g \to g + g$, $g \to q + \bar q$, and $g \to b + \bar b$. Note that the total $S_g$ appears in 
$\exp(-S_g)$ independently of whether the gluon ultimately decays to $g + g$, $q + \bar q$, or $b + \bar b$. In 
this section, we work out the contribution from $g \to g + g$. 

We start with Eq. (52) for $H_{ggg}$ with $k_s/k_J$ replaced by $z$ and $k_h/k_J$ replaced by $(1 - z)$ 
in the case that the label $s$ of the softer daughter parton is $s = A$, or $k_h/k_J$ replaced by $z$ 
and $k_s/k_J$ replaced by $(1 - z)$ in the case that $s = B$. Since $H$ is symmetric under $k_s \leftrightarrow k_h$, 
the choice of $s$ does not affect the form of the result. However, now $s = A$ corresponds to 
$z < 1/2$ and $s = B$ corresponds to $z > 1/2$. This gives 

$$H_{ggg} \approx 8\pi C_A\,\frac{\alpha_s(\mu_J^2)}{\mu_J^2}\;\frac{\left[1 - z(1-z)\right]^2}{z(1-z)}\;\frac{\theta_{hk}^2}{\theta_{sh}^2 + \theta_{sk}^2}\;\Theta\!\left(2\,\frac{\mu_J^2}{k_J} < \frac{\mu_K^2}{k_K}\right) . \tag{72}$$

In the angular factor $\theta_{hk}^2/[\theta_{sh}^2 + \theta_{sk}^2]$, we use the notation from Eq. (36) that 
$\theta_{\alpha\beta}^2 = (y_\alpha - y_\beta)^2 + (\phi_\alpha - \phi_\beta)^2$. The angular factor is one for small angles $\theta_{sh}$ and is small when $\theta_{sh}^2 \gg \theta_{hk}^2$. 
Thus it is approximately a theta function that requires $\theta_{sh} < \theta_{hk}$. Here $\theta_{hk}$ is approximately 
the angle $\theta_{k(s)}$ between the mother parton and the parton $k(s)$ that carries the color line of 
the mother parton that is carried by the emitted soft parton. Thus we replace 

$$\frac{\theta_{hk}^2}{\theta_{sh}^2 + \theta_{sk}^2} \;\to\; \Theta\big(\theta_{sh} < \theta_{k(s)}\big) . \tag{73}$$

This is the angle-ordering approximation to the true dipole matrix element [47]. It is a 
rather crude approximation locally in angle space, but is a pretty good approximation after 
integrating from large $\theta$ to small $\theta$. With this approximation, we have 

$$H_{ggg} \approx 8\pi C_A\,\frac{\alpha_s(\mu_J^2)}{\mu_J^2}\;\frac{\left[1 - z(1-z)\right]^2}{z(1-z)}\;\Theta\big(\theta_{sh} < \theta_{k(s)}\big)\;\Theta\!\left(2\,\frac{\mu_J^2}{k_J} < \frac{\mu_K^2}{k_K}\right) . \tag{74}$$

We can translate the restrictions on $\theta_{sh}$ to restrictions on $z$. From Eq. (A60) of the appendix, 
we have, in the limit of small angles, 

$$\frac{\mu_J^2}{k_J^2} \approx z(1 - z)\,\theta_{sh}^2 . \tag{75}$$

Thus for $z < 1/2$ the relation $\theta_{sh} < \theta_{k(A)}$ becomes 

$$z(1 - z) > \frac{\mu_J^2}{\theta_{k(A)}^2\,k_J^2} . \tag{76}$$

Presuming that the right hand side of this inequality is much smaller than 1, we can simplify 
this approximately to 

$$z > \frac{\mu_J^2}{\theta_{k(A)}^2\,k_J^2} . \tag{77}$$

Similarly, we have a restriction on how small $(1 - z)$ can be, 

$$(1 - z) > \frac{\mu_J^2}{\theta_{k(B)}^2\,k_J^2} . \tag{78}$$

These inequalities can be combined as 

$$\frac{\mu_J^2}{\theta_{k(A)}^2\,k_J^2} < z < 1 - \frac{\mu_J^2}{\theta_{k(B)}^2\,k_J^2} . \tag{79}$$

Thus 



$$H_{ggg} \approx 8\pi C_A\,\frac{\alpha_s(\mu_J^2)}{\mu_J^2}\;\frac{\left[1 - z(1-z)\right]^2}{z(1-z)}\;\Theta\!\left(\frac{\mu_J^2}{\theta_{k(A)}^2\,k_J^2} < z < 1 - \frac{\mu_J^2}{\theta_{k(B)}^2\,k_J^2}\right)\Theta\!\left(2\,\frac{\mu_J^2}{k_J} < \frac{\mu_K^2}{k_K}\right) . \tag{80}$$

For the theta function $\Theta(\{\bar p_A, \bar p_B\} \in \text{fat jet})$ in Eq. (71), we note that if $\theta_{k(s)}$ is much 
smaller than the fat jet radius $R_F$, the theta function that imposes angular ordering, $\Theta(\theta^2 < \theta_{k(s)}^2)$, 
will almost always enforce that $\bar p_A$ and $\bar p_B$ are in the fat jet, so that 
$\Theta(\{\bar p_A, \bar p_B\} \in \text{fat jet}) = 1$. On the other hand, sometimes there is no color connected parton with label 
$k(s)$ in the fat jet. Then we use Eq. (53), which effectively defines $\theta_{k(s)} = \infty$. In this case, 
the theta function $\Theta(\{\bar p_A, \bar p_B\} \in \text{fat jet})$ limits $\theta$ to a maximum value on the order of the 
fat jet radius $R_F$. We take a simple approximation and replace $\Theta(\{\bar p_A, \bar p_B\} \in \text{fat jet})$ by 
$\Theta(\theta^2 < R_0^2)$, where $R_0$ is an adjustable parameter with default value $R_0 = R_F$. Thus we 
understand that we should make the replacement 

$$\theta_{k(s)} \;\to\; R_0 \tag{81}$$

when there is no color connected parton $k(s)$. 

In the case that parton $J$ is the parton that has no mother parton $K$ because it originates 
a jet, we use Eq. (54) for $H$. This amounts to making the replacement 

$$\frac{\mu_K^2}{2k_K} \;\to\; k_J \tag{82}$$

when there is no mother parton $K$. 

With these approximations for $H$, we can insert $H_{ggg}$ into Eq. (71) to obtain 

$$S_{ggg} \approx \int_{\mu_J^2}^{k_J\mu_K^2/(2k_K)} \frac{d\bar\mu_J^2}{\bar\mu_J^2}\;\frac{\alpha_s(\bar\mu_J^2)}{2\pi} \int_{\bar\mu_J^2/(\theta_{k(A)}^2 k_J^2)}^{1 - \bar\mu_J^2/(\theta_{k(B)}^2 k_J^2)} dz\;C_A\,\frac{\left[1 - z(1-z)\right]^2}{z(1-z)} , \tag{83}$$

where we understand that we are to make the replacement (81) in the case that there is no 
color connected parton $k(s)$ and the replacement (82) in the case that there is no mother 
parton $K$. Here we have performed the integration over $\varphi$ since, with our approximations, 
the integrand does not depend on $\varphi$. 

Note the structure of this. We integrate half the DGLAP kernel over $\bar\mu_J^2$ and $z$, with limits 
on the $z$ integral from the angular ordering approximation to the quantum coherence of soft 
gluon emission from color dipoles. We have half of the DGLAP kernel for $g \to g + g$ because 
we are integrating over the phase space for two identical particles and need a statistical 
factor 1/2. 
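The $z$ integral with angular-ordering limits can be done in closed form, since $[1 - z(1-z)]^2/(z(1-z)) = 1/z + 1/(1-z) - 2 + z(1-z)$. The sketch below compares the exact result between a lower limit $a$ and an upper limit $1 - b$ with the leading behaviour $\log(1/(ab)) - 11/6$ obtained by dropping terms suppressed by powers of $a$ and $b$. Here $a$ and $b$ stand for the two small ratios that appear as integration limits; the function names are illustrative.

```python
import math

def kernel_antiderivative(z):
    """Antiderivative of [1 - z(1-z)]^2 / (z(1-z))
    = 1/z + 1/(1-z) - 2 + z(1-z)."""
    return (math.log(z) - math.log(1.0 - z)
            - 2.0 * z + z ** 2 / 2.0 - z ** 3 / 3.0)

def z_integral_exact(a, b):
    """Integral of the kernel from z = a to z = 1 - b."""
    return kernel_antiderivative(1.0 - b) - kernel_antiderivative(a)

def z_integral_leading(a, b):
    """Leading behaviour for small a and b: log(1/(a b)) - 11/6."""
    return math.log(1.0 / (a * b)) - 11.0 / 6.0
```

For $a = b = 10^{-3}$ the two expressions agree to about 0.002 out of roughly 12, and the difference shrinks linearly as the limits become smaller.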

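We can perform the integration over $z$, giving 

$$S_{ggg} \approx \frac{C_A}{\pi}\int_{\mu_J^2}^{k_J\mu_K^2/(2k_K)} \frac{d\bar\mu_J^2}{\bar\mu_J^2}\;\alpha_s(\bar\mu_J^2)\left[\log\!\left(\frac{\theta_{k(A)}\theta_{k(B)}\,k_J^2}{\bar\mu_J^2}\right) - \frac{11}{12}\right] . \tag{84}$$

Here we have omitted terms that are suppressed by a power of $\bar\mu_J^2/[k_J^2\,\theta_{k(A)}^2]$ or $\bar\mu_J^2/[k_J^2\,\theta_{k(B)}^2]$. 

We can perform the integration over $\bar\mu_J^2$ by changing variables to $\alpha_s$ using 

$$\frac{d\bar\mu_J^2}{\bar\mu_J^2} = -\frac{d\alpha_s}{b_0\,\alpha_s^2} , \tag{85}$$

where $b_0 = (33 - 2n_f)/(12\pi)$. We take the number of flavors to be $n_f = 5$. We write 

$$\log\!\left(\frac{\theta_{k(A)}\theta_{k(B)}\,k_J^2}{\bar\mu_J^2}\right) = \frac{1}{b_0}\left[\frac{1}{\alpha_s(\theta_{k(A)}\theta_{k(B)}\,k_J^2)} - \frac{1}{\alpha_s(\bar\mu_J^2)}\right] . \tag{86}$$

This gives 

$$S_{ggg} = \frac{C_A}{\pi b_0^2}\left\{\log\!\left(\frac{\alpha_s(\mu_J^2)}{\alpha_s(k_J\mu_K^2/(2k_K))}\right)\left[\frac{1}{\alpha_s(\theta_{k(A)}\theta_{k(B)}\,k_J^2)} - \frac{11\,b_0}{12}\right] + \frac{1}{\alpha_s(\mu_J^2)} - \frac{1}{\alpha_s(k_J\mu_K^2/(2k_K))}\right\} . \tag{87}$$

Since $\mu_J^2 < k_J\mu_K^2/(2k_K)$, this quantity is positive as long as the partner angles $\theta_{k(A)}$ and $\theta_{k(B)}$ 
are not too small. However, since $S < 0$ is unphysical, we replace $S_{ggg} \to S_{ggg}\,\Theta(S_{ggg} > 0)$ 
just to be sure that we are never enhancing an unphysical region by having $e^{-S} > 1$. 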
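The change of variables to $\alpha_s$ can be cross-checked numerically. The sketch below assumes a one-loop running coupling $\alpha_s(\mu^2) = 1/(b_0\log(\mu^2/\Lambda^2))$ with an illustrative $\Lambda^2 = 0.04\ {\rm GeV}^2$ (a value chosen here only for the example, not taken from the paper). It compares a closed form of the type $\frac{C_A}{\pi b_0^2}\{\log(\alpha_s(\mu_J^2)/\alpha_s(\mu_{\max}^2))\,[1/\alpha_s(\theta_{k(A)}\theta_{k(B)}k_J^2) - 11b_0/12] + 1/\alpha_s(\mu_J^2) - 1/\alpha_s(\mu_{\max}^2)\}$ with a direct midpoint-rule integration of $\frac{C_A}{\pi}\,\alpha_s\,[\log(\theta_{k(A)}\theta_{k(B)}k_J^2/\bar\mu^2) - 11/12]$ over $\log\bar\mu^2$. Function names are hypothetical.

```python
import math

N_F = 5
B0 = (33.0 - 2.0 * N_F) / (12.0 * math.pi)
C_A = 3.0
LAMBDA2 = 0.04  # assumed Lambda^2 in GeV^2, for illustration only

def alpha_s(mu2):
    """One-loop running coupling (an assumption of this sketch)."""
    return 1.0 / (B0 * math.log(mu2 / LAMBDA2))

def s_ggg_closed(mu2_j, mu2_max, theta_scale2):
    """Closed form; theta_scale2 = theta_k(A) * theta_k(B) * k_J^2."""
    log_ratio = math.log(alpha_s(mu2_j) / alpha_s(mu2_max))
    bracket = 1.0 / alpha_s(theta_scale2) - 11.0 * B0 / 12.0
    return C_A / (math.pi * B0 ** 2) * (
        log_ratio * bracket + 1.0 / alpha_s(mu2_j) - 1.0 / alpha_s(mu2_max))

def s_ggg_numeric(mu2_j, mu2_max, theta_scale2, steps=20000):
    """Midpoint-rule integration in t = log(mu^2) of
    (C_A / pi) alpha_s(mu^2) [log(theta_scale2 / mu^2) - 11/12]."""
    t0, t1 = math.log(mu2_j), math.log(mu2_max)
    dt = (t1 - t0) / steps
    total = 0.0
    for i in range(steps):
        mu2 = math.exp(t0 + (i + 0.5) * dt)
        total += alpha_s(mu2) * (math.log(theta_scale2 / mu2) - 11.0 / 12.0) * dt
    return C_A / math.pi * total

def s_ggg_clamped(mu2_j, mu2_max, theta_scale2):
    """Apply the replacement S -> S * Theta(S > 0)."""
    return max(0.0, s_ggg_closed(mu2_j, mu2_max, theta_scale2))
```

With $\mu_J^2 = 100\ {\rm GeV}^2$, an upper limit of $2500\ {\rm GeV}^2$ and $\theta_{k(A)}\theta_{k(B)}k_J^2 = 10^4\ {\rm GeV}^2$, the two evaluations agree to high precision. Choosing a much smaller angular scale, for example $150\ {\rm GeV}^2$, drives the unclamped exponent negative, which is the situation the theta function is meant to guard against.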

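We also evaluate the Sudakov exponent for a $g \to q + \bar q$ splitting. Here we use $H_{gq\bar q}$ from 
Eq. (58). This gives 

$$S_{gq\bar q} \approx \int\!\frac{d\bar\mu_J^2}{\bar\mu_J^2}\;\frac{\alpha_s(\bar\mu_J^2)}{2\pi}\;\Theta\!\left(\mu_J^2 < \bar\mu_J^2 < \frac{k_J\,\mu_K^2}{2k_K}\right)\int_0^1\!dz\;T_R\left[z^2 + (1 - z)^2\right] . \tag{88}$$

We can perform the $z$-integration to give 

$$S_{gq\bar q} \approx \frac{2T_R}{3}\int\!\frac{d\bar\mu_J^2}{\bar\mu_J^2}\;\frac{\alpha_s(\bar\mu_J^2)}{2\pi}\;\Theta\!\left(\mu_J^2 < \bar\mu_J^2 < \frac{k_J\,\mu_K^2}{2k_K}\right) . \tag{89}$$

Then, we can perform the $\bar\mu_J^2$ integration using Eq. (85) to give 

$$S_{gq\bar q} = \frac{T_R}{3\pi b_0}\,\log\!\left(\frac{\alpha_s(\mu_J^2)}{\alpha_s(k_J\mu_K^2/(2k_K))}\right) . \tag{90}$$

Adding $S_{ggg}$ and one copy of $S_{gq\bar q}$ for each quark flavor, including the $b$-quark, we obtain 
the complete Sudakov exponent for gluon splitting 

$$S_g = S_{ggg}\,\Theta(S_{ggg} > 0) + n_f\,S_{gq\bar q} . \tag{91}$$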



D. Sudakov exponent for quark splitting 

The Sudakov factor for a quark splitting is illustrated in Fig. 14. The corresponding 
Sudakov exponent is given by Eq. (71) using $H_{qqg}$ from Eq. (55). In $H_{qqg}$ we replace the 
angular factor $\theta_{qk}^2/[\theta_{gq}^2 + \theta_{gk}^2]$ by $\Theta(\theta < \theta_k)$ as in Eq. (55). In turn, the restriction on $\theta$ 
amounts to a restriction on $z$, 

$$z > \frac{\mu_J^2}{\theta_k^2\,k_J^2} , \tag{92}$$

with $z$ now denoting the momentum fraction carried by the emitted gluon. This gives 

$$S_{qqg} \approx \int_{\mu_J^2}^{k_J\mu_K^2/(2k_K)} \frac{d\bar\mu_J^2}{\bar\mu_J^2}\;\frac{\alpha_s(\bar\mu_J^2)}{2\pi}\int_{\bar\mu_J^2/(\theta_k^2 k_J^2)}^{1} dz\;C_F\,\frac{1 + (1 - z)^2}{z} . \tag{93}$$

We can perform the $z$-integration to obtain 

$$S_{qqg} \approx \frac{C_F}{\pi}\int_{\mu_J^2}^{k_J\mu_K^2/(2k_K)} \frac{d\bar\mu_J^2}{\bar\mu_J^2}\;\alpha_s(\bar\mu_J^2)\left[\log\!\left(\frac{\theta_k^2\,k_J^2}{\bar\mu_J^2}\right) - \frac{3}{4}\right] . \tag{94}$$

Here we have neglected terms suppressed by a power of $\bar\mu_J^2/(k_J^2\,\theta_k^2)$. 

We can now use Eqs. (85) and (86) to perform the $\bar\mu_J^2$ integration, giving 

$$S_{qqg} = \frac{C_F}{\pi b_0^2}\left\{\log\!\left(\frac{\alpha_s(\mu_J^2)}{\alpha_s(k_J\mu_K^2/(2k_K))}\right)\left[\frac{1}{\alpha_s(\theta_k^2\,k_J^2)} - \frac{3\,b_0}{4}\right] + \frac{1}{\alpha_s(\mu_J^2)} - \frac{1}{\alpha_s(k_J\mu_K^2/(2k_K))}\right\} . \tag{95}$$

As in the case of gluon splitting, it is possible that, after our approximations, $S_{qqg}$ is negative. 
Since $S < 0$ is unphysical, we define the complete Sudakov exponent for a quark to be 

$$S_q = S_{qqg}\,\Theta(S_{qqg} > 0) \tag{96}$$

just to be sure that we are never enhancing an unphysical region by having $e^{-S} > 1$. 

Sometimes there is no color connected parton with label $k$ in the fat jet. Then, as in 
Eq. (81) for $S_g$, we make the replacement $\theta_k \to R_0$. 

FIG. 14: Sudakov factor between final state emission of a gluon from a quark or antiquark. The 
quark or antiquark flavor can be $b$ or $u$, $d$, $s$ or $c$. The previous splitting can be either a gluon 
emission, a $g \to q + \bar q$ or $g \to b + \bar b$ splitting or a Higgs boson decay to $b + \bar b$. 


E. After the last splitting 



If, in the shower history h, parton J does not split, then we look at its virtuality fi 2 and 
include a factor e~ S{ > or e~ Sq , as illustrated in Fig. 15 that represents the probability for 
parton J not to have split at a virtuality above the final virtuality fi 2 . 
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FIG. 15: Sudakov factors for partons with no further splittings. 



In principle, we should also include a factor $\int\! dH$ representing the probability that parton $J$ did finally split at virtuality $\mu_J^2$. We do not know the splitting angle $\theta$ for this splitting. We do know that $\theta$ was less than $R_{\rm microjet}$, the radius parameter for the $k_T$-jet algorithm that we used to define the microjets: if $\theta$ were larger than $R_{\rm microjet}$, the jet algorithm would not have merged the daughter partons to form the microjet. Thus we would calculate $\int\! dH$ by integrating the differential splitting function over the region $\theta < R_{\rm microjet}$. We do not, in fact, include a splitting factor $\int\! dH$ because this factor is independent of the shower history $h$ and independent of whether we are looking at signal histories or background histories. Thus it cancels from $\chi$. Since we do not need this factor, we do not calculate it.



F. Sudakov factor for initial state emissions 



What are the Sudakov factors for the initial state emissions? The initial state emissions can conveniently be ordered according to the value of $k^2$. The Sudakov exponent to go from a previous emission scale $k_1^2$ to the new scale $k_2^2$ without a visible initial state emission is, using Eq. (32),

(97)



Here we only count emissions into the region in which the decay products of the emitted parton will be seen as part of the fat jet. Approximately, we can take

$$ \int\! dy \int\! d\phi\ \Theta(p \in \text{fat jet}) = \pi R_F^2 , \tag{98} $$

where $R_F$ is the radius parameter that defines the fat jet. Then

(99)



FIG. 16: Splitting probability $H e^{-S}$ for Higgs boson decay.

The initial state shower starts at a transverse momentum scale equal to the scale $Q^2/4$, where $Q^2$ is defined in Eq. (21) and represents the scale of the hard interaction. It ends at a scale $k_{\rm cut}^2$, where $k_{\rm cut}$ is the smallest transverse momentum of a microjet that can register in the detector, for instance $k_{\rm cut} = 0.5$ GeV. In general, there are multiple initial state emissions. We get a Sudakov factor for each one, times a factor for not having an emission between the last one and $k_{\rm cut}^2$. The product of these is $\exp(-S_{\rm IS})$ where



S 



is 
(100)

The factor $\exp(-S_{\rm IS})$ is independent of the splitting values $k_1^2, k_2^2, \dots$. It does depend on the hard scattering scale $Q^2$, which varies from event to event. However, note that $Q^2$ is independent of the shower history and is the same for shower histories that represent background and signal processes. Thus the factor $\exp(-S_{\rm IS})$ will cancel exactly between signal and background factors in our observable $\chi$, so we can simply replace

$$ \exp(-S_{\rm IS}) \to 1 . \tag{101} $$



VIII. HIGGS DECAY PROBABILITY 



A light Higgs boson decays most often into $b + \bar b$. Since we consider only the $b + \bar b$ decay mode, it suffices to treat the Higgs boson as if it always decayed to $b + \bar b$. In the sections on splittings in a parton shower, we have specified a conditional splitting probability $H$, the probability for a splitting at a given virtuality $\mu_J^2$ if the parton has not split at a higher $\mu_J^2$. The total splitting probability is then $H e^{-S}$, where $e^{-S}$ is the probability that the parton has not split at a higher $\mu_J^2$. In this section, for the Higgs decay, we specify the total decay probability $H e^{-S}$, depicted in Fig. 16.



The light Higgs boson is a very narrow object. In the narrow width approximation, the differential decay probability is

$$ H e^{-S} = 16\pi^2\, \delta(m_{b\bar b}^2 - m_H^2) . \tag{102} $$

The normalization is arranged so that the total probability that the Higgs decays, using the integration measure in Eq. (70), is 1:

$$ \frac{1}{4(2\pi)^3} \int\! d\mu_J^2 \int\! dz \int\! d\varphi\ H e^{-S} = 1 . \tag{103} $$



Although a low mass Higgs boson is a very narrow object, the precision of its mass reconstruction is limited by detector resolution effects and by the loss of momentum resolution caused by grouping final state particles into microjets. To take these issues into account, we treat the Higgs boson decay as if the invariant mass of its decay products can be anything within a $\pm\Delta m_H$ window around the physical Higgs mass, $m_H$. Thus we artificially modify the differential decay probability to

$$ H e^{-S} = 16\pi^2\, \frac{\Theta(|m_{b\bar b} - m_H| < \Delta m_H)}{4 m_H\, \Delta m_H} . \tag{104} $$

Our default value for $\Delta m_H$ is 10 GeV.
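As a numerical cross-check of the normalization, the following Python sketch evaluates the windowed decay probability and integrates it over $\mu_J^2 = m_{b\bar b}^2$. It assumes the form $He^{-S} = 16\pi^2\,\Theta(|m_{b\bar b}-m_H|<\Delta m_H)/(4 m_H \Delta m_H)$ and the measure $\frac{1}{4(2\pi)^3}\int d\mu_J^2 \int_0^1 dz \int_{-\pi}^{\pi} d\varphi$. The value $m_H = 120$ GeV and all function names are illustrative choices, not taken from this section.

```python
import math

def higgs_decay_probability(m_bb, m_h=120.0, delta_m_h=10.0):
    """Windowed H*exp(-S) for H -> b bbar (masses in GeV; m_h is an assumed value)."""
    if abs(m_bb - m_h) < delta_m_h:
        return 16.0 * math.pi**2 / (4.0 * m_h * delta_m_h)
    return 0.0

def total_decay_probability(m_h=120.0, delta_m_h=10.0, steps=10000):
    """Integrate H*exp(-S) over mu_J^2 with the assumed measure.

    The integrand does not depend on z or varphi, so those integrals
    contribute factors of 1 and 2*pi respectively.
    """
    lo, hi = (m_h - delta_m_h) ** 2, (m_h + delta_m_h) ** 2
    width = (hi - lo) / steps
    integral = 0.0
    for i in range(steps):
        mu2 = lo + (i + 0.5) * width  # midpoint rule in mu_J^2
        integral += higgs_decay_probability(math.sqrt(mu2), m_h, delta_m_h) * width
    return integral * 2.0 * math.pi / (4.0 * (2.0 * math.pi) ** 3)

print(total_decay_probability())  # close to 1 under the stated assumptions
```

Under these assumptions the window of width $2\Delta m_H$ in mass corresponds to a width $4 m_H \Delta m_H$ in $\mu_J^2$, which is why the denominator $4 m_H \Delta m_H$ restores unit total probability.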



IX. b-TAGS

We have described in Sec. II B how we assign $b$-tags T, F, or none to microjets produced by Pythia or Herwig in a way that mimics imperfect $b$-tagging in an experiment. Tags T or F are assigned only to microjets that are among the three highest $p_T$ microjets in the event and, additionally, have $p_T > p_T^{\rm tag}$, where we take $p_T^{\rm tag} = 15$ GeV.

In this section, we examine how to assign probabilities that a given $b$-tag value will be generated in the simplified shower. We seek to simulate the probabilities with which the algorithm specified above generates $t_j$ values T, F, or none when operating on events generated by the full Pythia or Herwig.

We suppose that we are given a microjet state, with momenta $p_j$ for each microjet and with a T or F $b$-tag for each microjet that has large enough transverse momentum. We need to estimate the probability $P_j({\rm T})$ that microjet $j$ receives a tag $t_j = {\rm T}$ and the probability $P_j({\rm F})$ that microjet $j$ receives a tag $t_j = {\rm F}$. Then if, in fact, $t_j = {\rm T}$, we include in $P(\{p,t\}_N|{\rm S},h)$ (for a signal history $h$) or $P(\{p,t\}_N|{\rm B},h)$ (for a background history $h$) a factor $P_j({\rm T})$. If $t_j = {\rm F}$, we include a factor $P_j({\rm F})$.

How should we calculate $P_j({\rm T})$ and $P_j({\rm F})$? We note that the situation is simpler than for a real Pythia or Herwig shower because each microjet consists of precisely one parton and each parton $i$ has a definite flavor which can be $b$ or $\bar b$ or could be a flavor that is not $b$ or $\bar b$, namely $q$ or $\bar q$ or $g$. We make the definition as follows, using the probabilities $P({\rm T}|b)$ and $P({\rm T}|{\sim}b)$ defined in Sec. II B:

• If a microjet $j$ is a $b$ or $\bar b$ quark, then we say that $t_j = {\rm T}$ with a probability $P_j({\rm T}) = P({\rm T}|b)$ and $t_j = {\rm F}$ with a probability $P_j({\rm F}) = 1 - P({\rm T}|b)$.

• If microjet $j$ is not a $b$ or $\bar b$ quark, then we say that $t_j = {\rm T}$ with a probability $P_j({\rm T}) = P({\rm T}|{\sim}b)$ and $t_j = {\rm F}$ with a probability $P_j({\rm F}) = 1 - P({\rm T}|{\sim}b)$.
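The two rules above can be sketched in Python as follows. The numerical values used for $P({\rm T}|b)$ and $P({\rm T}|{\sim}b)$ are placeholders chosen for illustration (the actual values are those of Sec. II B), the function names are hypothetical, and the treatment of untagged microjets as contributing a factor 1 is an assumption of this sketch.

```python
def btag_factor(is_b_quark, tag, p_tag_given_b=0.6, p_tag_given_not_b=0.02):
    """Probability factor for the tag of one microjet ('T', 'F', or None)."""
    if tag is None:
        # Assumption: microjets not eligible for tagging contribute no factor.
        return 1.0
    p_true = p_tag_given_b if is_b_quark else p_tag_given_not_b
    if tag == "T":
        return p_true
    if tag == "F":
        return 1.0 - p_true
    raise ValueError("tag must be 'T', 'F', or None")

def btag_probability(flavors_are_b, tags):
    """Product of the per-microjet tag factors for one shower history."""
    prob = 1.0
    for is_b, tag in zip(flavors_are_b, tags):
        prob *= btag_factor(is_b, tag)
    return prob

# Two b quarks (one tagged, one missed) and one untagged light parton.
print(btag_probability([True, True, False], ["T", "F", None]))
```

Because the flavor of each parton depends on the shower history, the same set of observed tags generally yields different factors for signal histories (with $H \to b + \bar b$) and for background histories.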



X. CONSTRUCTING SHOWER HISTORIES 

We have now described how to calculate a probability $P(\{p,t\}_N|{\rm S},h)$ for each signal
history $h$ and a probability $P(\{p,t\}_N|{\rm B},h)$ for each background history $h$. We simply look
at the diagram that describes the shower history and associate a factor with each element of 
the diagram. Now we need to generate shower histories. Because our method for combining 
daughter jets to form a mother jet is so simple, we can construct a set of possible shower 
histories in a fairly simple fashion. 

We begin with a list of the starting microjets. We divide these into two sets in all 
possible ways. One set consists of decay products of partons emitted as initial state or 
underlying event radiation, the second consists of the decay products of the parton (a gluon for background or a Higgs boson for signal) that is produced in the hard interaction and creates the bulk of the fat jet.

We divide the set of the microjets associated with initial state emissions into any number 
of non-empty subsets. Each of these subsets is associated with one parton emitted in the 
initial state. 

Now consider the set of microjets associated with the hard parton. In a shower history, 
the hard parton splits into two partons. The first of these eventually splits to make a subset 
of the final partons. Call this the set L. The second of these eventually splits to make the 
complementary subset of the final partons. Call this the set R. Thus to generate the first 
splitting of the hard parton, we choose the set L and the set R. 

For each possible first splitting, we proceed to the second splittings. We can start with 
the set L. We divide this into subsets LL and LR. Each of these choices represents a 
possible splitting. We can simply continue this way until we reach a parton that consists of 
exactly one microjet. 

Each parton emitted in the initial state, as constructed above, consists of a subset of 
the microjets. If there is more than one microjet in this subset, we can divide it into
left and right subsets, which describes a splitting of this parton. Again, this process can be 
continued until we reach a parton that consists of exactly one microjet. 

Note that each parton in the developing shower history consists of a subset of the microjets. Thus we know that the momentum of this parton is $\sum p_i$, summed over this subset. We
do not need to know anything about the later shower history of this parton to calculate its 
momentum. Thus as soon as we have generated a parton splitting, we have the information 
to calculate the probability for this splitting. The splitting probabilities contain various 
theta functions that can make the splitting probability equal to zero. When this happens, 
we can abandon the splitting and try another. 

Evidently, the shower histories and the corresponding probabilities can be calculated 
recursively with a simple computer program. That is what we have done. 
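The recursive construction for a single parton can be sketched as follows. This is a simplified illustration and not the authors' program: it enumerates only the binary splitting trees over a set of microjet labels, treats the two daughters of a splitting as unordered, and ignores flavor assignments, the partition into initial state and hard-parton microjets, and the splitting probabilities. The function names are hypothetical.

```python
from itertools import combinations

def binary_splits(items):
    """Yield each unordered split of `items` into two non-empty subsets (L, R)."""
    items = tuple(items)
    first, rest = items[0], items[1:]
    # Keeping the first element in L counts every unordered split exactly once.
    for n_extra in range(len(rest)):
        for extra in combinations(rest, n_extra):
            left = (first,) + extra
            right = tuple(x for x in rest if x not in extra)
            yield left, right

def shower_histories(items):
    """Yield nested tuples, one per binary splitting tree over `items`."""
    items = tuple(items)
    if len(items) == 1:
        yield items[0]  # a parton consisting of exactly one microjet
        return
    for left, right in binary_splits(items):
        for left_history in shower_histories(left):
            for right_history in shower_histories(right):
                yield (left_history, right_history)

for history in shower_histories(["a", "b", "c"]):
    print(history)
```

In a fuller implementation, one would evaluate the splitting probability as soon as the sets L and R are chosen and abandon the branch when a theta function makes it vanish, which prunes the rapidly growing number of trees.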



XI. NUMERICAL RESULTS 

We have now seen what shower deconstruction is. In this section, we explore how effective it is for separating signal from background for $p + p \to H + Z + X \to H + \ell^+ + \ell^- + X$. We apply the shower deconstruction method to events generated by Pythia, with some comparisons using Herwig also. The event selection was described in Sec. II A.

Suppose that we base our analysis on counting events above a cut $\chi$, using the integrated cross sections $s(\chi)$ and $b(\chi)$ defined in Eq. (10).$^6$ What value of $\chi$ should one choose? If integrated luminosity $\int\! dL$ is available, the expected statistical significance of counting events with $\chi(\{p,t\}_N) > \chi$ is

$$ \frac{N({\rm S})}{\sqrt{N({\rm B})}} = \left[ \left(\int\! dL\right) \frac{s(\chi)^2}{b(\chi)} \right]^{1/2} . \tag{105} $$

Thus one would choose the value of $\chi$ that maximizes $s^2/b$.



6 It would be better to use a likelihood ratio based on the full distribution of $ds(\chi)/d\chi$ and $db(\chi)/d\chi$, but the use of a simple cut is easier to describe.

FIG. 17: Plot of $s^2/b$ versus $s$, where $s$ and $b$ are defined in Eq. (10). We use samples of signal and background events generated by Pythia as in Fig. 1. This is the same plot as in Fig. 2 except that we plot $s^2/b$ instead of $s/b$. The total signal cross section with the cuts used is $\sigma_{\rm MC}({\rm S}) = 1.57$ fb. We also show a point corresponding to a signal cross section $\sigma_{\rm BDRS}({\rm S}) = 0.22$ fb and background cross section $\sigma_{\rm BDRS}({\rm B}) = 0.44$ fb that we obtained using the method of Ref. [3].

In Fig. 1 we displayed the $\chi$ distribution for signal and background. We used this information to display $s/b$ as a function of $s$ in Fig. 2. In order to understand the statistical significance of a counting experiment with a simple cut on $\chi$, we have seen above that one wants to look at the maximum of $s^2/b$. For that reason, in Fig. 17, we display the information from Fig. 2 as a plot of $s^2/b$ versus $s$. We have used here the function $\chi(\{p,t\}_N)$ from our simplified shower algorithm. If we could somehow use $\chi_{\rm MC}(\{p,t\}_N)$, using the same Monte Carlo that we use to generate events, we would obtain a curve for $s^2/b$ versus $s$ that is everywhere higher. No algorithm could produce a curve above this limiting curve, but we have no way of determining the limiting curve.

We see in Fig. 17 that one can achieve a fairly good statistical significance with, say, an integrated luminosity of $\int\! dL = 30\ {\rm fb}^{-1}$. With $s^2/b \approx 0.26$ and this luminosity we have $N({\rm S})/\sqrt{N({\rm B})} \approx 2.8$. We can compare to the method of Ref. [3] (BDRS). Applying this method with our data sample, we find a signal cross section $\sigma_{\rm BDRS}({\rm S}) = 0.22$ fb and background cross section $\sigma_{\rm BDRS}({\rm B}) = 0.44$ fb. We have plotted this point in Fig. 17. The corresponding statistical significance with $\int\! dL = 30\ {\rm fb}^{-1}$ is 1.8. Of course, this analysis ignores all systematic uncertainties.
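The two significance values quoted above follow directly from $N({\rm S})/\sqrt{N({\rm B})} = [(\int\! dL)\, s^2/b]^{1/2}$. A short Python check, with illustrative function names and with $s^2/b$ taken in fb:

```python
import math

def significance(luminosity_fb_inv, s_fb, b_fb):
    """N(S)/sqrt(N(B)) for signal and background cross sections given in fb."""
    return math.sqrt(luminosity_fb_inv * s_fb**2 / b_fb)

def significance_from_ratio(luminosity_fb_inv, s2_over_b_fb):
    """Same quantity when s^2/b is read off a curve directly."""
    return math.sqrt(luminosity_fb_inv * s2_over_b_fb)

print(round(significance(30.0, 0.22, 0.44), 1))       # BDRS point: 1.8
print(round(significance_from_ratio(30.0, 0.26), 1))  # shower deconstruction: 2.8
```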

In the analysis presented above, we include events with zero, one, and two $b$-tags. Then shower deconstruction has to overcome a signal to background ratio of about 1/1700 in the complete event sample in order to extract a few events with a signal to background ratio of order 1. One suspects that, in fact, the events with zero or one $b$-tags do not contribute much to the discriminating power of the method. Accordingly, we now explore what happens when we give shower deconstruction an easier job by restricting the event sample to just events in which there are two $b$-tagged microjets among the three microjets with the highest transverse momenta that have, additionally, $p_T > 15$ GeV. With these cuts, the signal sample is 0.39 fb and the background sample is 11 fb. We lose a lot of signal events, but now the signal to background ratio in the event sample is only about 1/30, so the job remaining for shower deconstruction is easier.

FIG. 18: $d\sigma_{\rm MC}({\rm B})/d\log\chi$ for background events (upper curve) and $d\sigma_{\rm MC}({\rm S})/d\log\chi$ for signal events (lower curve) for samples of signal and background events generated by Pythia. We use the cuts described in Sec. II A and, in addition, require that at least two of the three highest $p_T$ microjets with $p_T > 15$ GeV have positive $b$-tags.

In Fig. 18 we display the functions $d\sigma_{\rm MC}({\rm S})/d\log\chi$ and $d\sigma_{\rm MC}({\rm B})/d\log\chi$ for the two $b$-tag sample. We again find a region with $s > b$. In Fig. 19 we display the information from Fig. 18 as a plot of $s^2/b$ versus $s$. We also show the $s^2/b$ versus $s$ curve from Fig. 17 for all events with no restriction on $b$-tags and the point that we obtained using the method of Ref. [3].$^7$ We see that for $s \gtrsim 2.5$ fb, $s^2/b$ with the restricted event sample is smaller than it is with the unrestricted event sample. However for $s \lesssim 2.0$ fb, $s^2/b$ with the restricted event sample is about the same as with the unrestricted event sample.



The formulas that define the simplified shower used to construct Fig. [19] contain a number 

2 



of parameters that reflect nonperturbative physics. Among them are c np , /t~ p , n Qp , cr 



and k 2 in Eq. ( 32 ) , N^ d{ in Eq. ( 23 ) , and N^ df in Eq. ( 24 ) . There are other parameters like the 
factor 2 for the hardness cut on splittings in Eq. (50) that could have been set differently.



We have not systematically tested whether the performance of shower deconstruction as reflected in Fig. 19 is sensitive to the parameter choices, but we have tried some variations. Typically we found that $d\sigma_{\rm MC}({\rm B})/d\log\chi$ for background events and $d\sigma_{\rm MC}({\rm S})/d\log\chi$ for signal events change in the same direction. Thus we find that the curve in Fig. 19 is not very sensitive to the parameter variations that we tested.$^8$

We have used Pythia [33] for our comparisons. What would happen if we used Herwig [31] instead? We show in Fig. 20 the cross sections $d\sigma_{\rm MC}({\rm B})/d\log\chi$ and $d\sigma_{\rm MC}({\rm S})/d\log\chi$ for two $b$-tag samples of signal and background events generated by Pythia and by Herwig. We have normalized the cross sections within our cuts to be the same for both Pythia and Herwig, so that we are looking at differences in shape rather than normalization. We see that the behaviors obtained with the two event generators are quite similar but that with Herwig a somewhat larger fraction of the background events have large $\chi$. That there are differences is not a surprise since both event generators work at leading order in perturbation theory for their splitting kernels and make approximations with respect to color and spin of partons. One lesson from this is that in experimental applications of shower deconstruction or of other jet substructure measures one will want to test the Monte Carlo cross sections against experiment.

7 The method of Ref. [3] uses only events with two $b$-tags.

8 We did find that $s^2/b$ could be increased by making the Sudakov exponent for gluon splitting a bit larger, but we have not explored this further.

FIG. 19: Plot of $s^2/b$ versus $s$ for events with at least two $b$-tags among the three highest $p_T$ microjets that have $p_T > 15$ GeV in addition. We use samples of signal and background events generated by Pythia as in Fig. 18. We also show the curve from Fig. 17 for all events with no restriction on $b$-tags (dashed curve) and the point that we obtained using the method of Ref. [3].



In Fig. 21 we compare results from the two $b$-tag sample using Pythia and Herwig for $s^2/b$ as a function of $s$. We also show results using Pythia and Herwig for $s^2/b$ using the BDRS method. For Pythia, these are the results that were exhibited in Fig. 19. We see that there is about a 30% difference between Pythia and Herwig results. Again, this level of difference using leading order event generators is not a surprise.



FIG. 20: $d\sigma_{\rm MC}({\rm B})/d\log\chi$ for background events and $d\sigma_{\rm MC}({\rm S})/d\log\chi$ for signal events for samples of signal and background events generated by Pythia and by Herwig. We use the cuts described in Sec. II A and, in addition, require that at least two of the three highest $p_T$ microjets with $p_T > 15$ GeV have positive $b$-tags. The solid (blue) lines are for Pythia while the dashed (red) lines are for Herwig. At small $\chi$, the background curves are on the top and the signal curves are on the bottom.

FIG. 21: Plot of $s^2/b$ versus $s$ for events with two positive $b$-tags. We compare the distribution of $s^2/b$ for events generated with Pythia as in Fig. 19 to the same distribution using events generated with Herwig. We normalize the total signal and background cross sections with these cuts to be $\sigma_{\rm MC}({\rm S}) = 0.39$ fb, $\sigma_{\rm MC}({\rm B}) = 11$ fb. We also show points that we obtained using the method of Ref. [3]. Using Pythia we found $\sigma_{\rm BDRS}({\rm S}) = 0.22$ fb and $\sigma_{\rm BDRS}({\rm B}) = 0.44$ fb, as in Fig. 19, while using Herwig we found $\sigma_{\rm BDRS}({\rm S}) = 0.20$ fb and $\sigma_{\rm BDRS}({\rm B}) = 0.49$ fb.

XII. CONCLUSIONS

We have proposed a method, shower deconstruction, for separating signal and background events when we have a definite theory in mind for the signal as well as for the standard model background with the signal process omitted. We have explained the method using a simple signal process, $p + p \to H + Z + X \to H + \ell^+ + \ell^- + X$. Here the event selection is chosen so that the Higgs boson that we hope to find is boosted to a substantial transverse momentum. The shower deconstruction method itself is quite general and could be applied to signal processes with more structure or perhaps to signal processes in which the sought massive objects are not highly boosted.

The idea of shower deconstruction can be described in very few words. With data at hand, one begins by clustering final state particles in a region of the detector (the "fat jet" in our example) into much smaller jets, the microjets, using the $k_T$-jet algorithm. Alternatively one could use some other jet algorithm or one could use topological clusters defined directly using the calorimetry of the experiment. This gives a fairly fine grained description of the event, with the momenta $p_i$ and possibly flavor tags $t_i$ for each microjet. In order to keep within reasonable bounds for computer resources, one can limit the number $N$ of microjets by discarding the lowest transverse momentum microjets as necessary. One wants to be fine grained enough to see not only the direct decay products of a sought heavy particle but also gluon radiation that reflects the color structure of the signal or background final state. Then one computes approximately the probability $P(\{p,t\}_N|{\rm S})$ to obtain the observed microjet state $\{p,t\}_N$ from the signal process and the probability $P(\{p,t\}_N|{\rm B})$ to obtain the microjet state from a background process. We construct the observable $\chi(\{p,t\}_N) = P(\{p,t\}_N|{\rm S})/P(\{p,t\}_N|{\rm B})$ as the ratio of these and use $\chi$ to distinguish signal from background. The value of $\chi$ is calculated using a simplified shower algorithm that tries to mimic what a partitioned dipole shower with initial state radiation and underlying event contributions would give. The microjets are treated as intermediate state partons in the shower. We want the calculation to be as accurate as possible, but it needs to be an analytic calculation that can be executed with a not-too-large amount of computer time for each event. There is a tension between these goals. We expect that other workers will be able to improve on the compromise algorithm that we have described in this paper.

This method is similar in spirit to the matrix element method [5HH]. There, if one started
from the microjet configuration $\{p,t\}_N$, one would compute $\chi(\{p,t\}_N)$ from the squared
matrix element for the signal or background process convoluted with the parton distribution
functions, integrated over the momenta of unobserved partons. If one were to use a number
of partons $N$ that is greater than the minimum possible number for the desired signal and
background and if one were to calculate $\chi(\{p,t\}_N)$ analytically, one would have something
close to the shower deconstruction method. In one sense, one would then have a better 
approximation to nature than the simplified shower algorithm of this paper because one 
would be using the exact squared matrix element rather than a soft-collinear approximation 
to it. However, one would be missing the Sudakov factors. Without Sudakov factors, the 
probability for a parton splitting becomes infinite as the virtuality of the splitting tends to 
zero. With Sudakov factors, the probability for a parton splitting approaches zero as the 
virtuality of the splitting tends to zero. For this reason, one needs the Sudakov factors. 
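The role of the Sudakov factor can be illustrated with a toy model. The sketch below is not the splitting function of this paper: it assumes a double-logarithmic kernel $H(\mu^2) = (a/\mu^2)\ln(Q^2/\mu^2)$, for which the Sudakov exponent is $S(\mu^2) = \int_{\mu^2}^{Q^2} d\bar\mu^2\, H(\bar\mu^2) = (a/2)\ln^2(Q^2/\mu^2)$. The parameters `a` and `q2` and the function names are invented for illustration.

```python
import math

def toy_splitting(mu2, q2=1.0e4, a=0.5):
    """Toy conditional splitting density H(mu^2) = (a/mu^2) ln(Q^2/mu^2)."""
    return a / mu2 * math.log(q2 / mu2)

def toy_sudakov_exponent(mu2, q2=1.0e4, a=0.5):
    """S(mu^2) = integral of H from mu^2 up to Q^2 = (a/2) ln^2(Q^2/mu^2)."""
    return 0.5 * a * math.log(q2 / mu2) ** 2

def toy_total_probability(mu2, q2=1.0e4, a=0.5):
    """H * exp(-S): splitting density including the no-splitting probability."""
    return toy_splitting(mu2, q2, a) * math.exp(-toy_sudakov_exponent(mu2, q2, a))

for mu2 in (1.0e2, 1.0, 1.0e-2, 1.0e-4):
    print(mu2, toy_splitting(mu2), toy_total_probability(mu2))
```

In this toy model $H$ alone grows without bound as $\mu^2 \to 0$, while $H e^{-S}$ decreases toward zero, which is the qualitative behavior described above.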

We have found that in our simple example shower deconstruction can achieve a signal/background discrimination superior to that of Ref. [3]. Furthermore, shower deconstruction has some features that suggest that it may prove useful as a practical tool. First,
it is quite general, although further development is needed to apply the general method to 
other signal processes. Second, it is modular, with modules corresponding to QCD parton 
splitting, initial state radiation, underlying event contributions, Sudakov factors, and heavy 
particle decay. The modules can be improved independently and inserted into the general 
scheme. Third, the method has at least the potential to work for quite complicated signal 
processes. 
Appendix A: The jacobian

In this appendix, we analyze the integral

$$ I_J = \frac{1}{16} \int\! dk_A^2 \int\! dy_A \int\! d\phi_A \int\! dk_B^2 \int\! dy_B \int\! d\phi_B\ \delta^4(p_A + p_B - p_J) \times f . \tag{A1} $$

Here $p_A$ and $p_B$ are the momenta of two jets that together form the jet with momentum $p_J$. In our application, the two constituent jets have non-zero masses, $\mu_A$ and $\mu_B$. However, the masses $\mu_A$ and $\mu_B$ are small compared to the jet transverse momenta $k_A$ and $k_B$ and compared to the combined jet mass, $\mu_J$. Thus it is a good approximation to neglect the constituent jet masses; furthermore, doing so leads to a substantially simpler result. We therefore set $\mu_A = \mu_B = 0$. With this choice, the $(+,-,1,2)$ components of the momenta of the jets are (with $p^\pm = (p^0 \pm p^3)/\sqrt{2}$)

$$ \begin{aligned}
p_A &= \left( \tfrac{1}{\sqrt 2} k_A e^{y_A},\ \tfrac{1}{\sqrt 2} k_A e^{-y_A},\ k_A \cos\phi_A,\ k_A \sin\phi_A \right) , \\
p_B &= \left( \tfrac{1}{\sqrt 2} k_B e^{y_B},\ \tfrac{1}{\sqrt 2} k_B e^{-y_B},\ k_B \cos\phi_B,\ k_B \sin\phi_B \right) , \\
p_J &= \left( \tfrac{1}{\sqrt 2} \sqrt{k_J^2 + \mu_J^2}\ e^{y_J},\ \tfrac{1}{\sqrt 2} \sqrt{k_J^2 + \mu_J^2}\ e^{-y_J},\ k_J \cos\phi_J,\ k_J \sin\phi_J \right) .
\end{aligned} \tag{A2} $$

We wish to write $I_J$ in the form

$$ I_J = \int\! dz \int\! d\varphi\ J \times f . \tag{A3} $$

Here $z$ is a momentum fraction defined by

$$ z = \frac{k_A}{k_A + k_B} . \tag{A4} $$

Then

$$ 1 - z = \frac{k_B}{k_A + k_B} \tag{A5} $$

and

$$ z(1-z) = \frac{k_A k_B}{(k_A + k_B)^2} . \tag{A6} $$
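We define the variable $\varphi$ by

$$ \tan\varphi = \frac{\sinh(\Delta y/2)\cos(\Delta\phi/2)}{\cosh(\Delta y/2)\sin(\Delta\phi/2)} , \tag{A7} $$

where

$$ \Delta y = y_A - y_B , \qquad \Delta\phi = \phi_A - \phi_B . \tag{A8} $$

Thus $\varphi$ is approximately the angle about the origin in the $(\Delta\phi, \Delta y)$ plane. We need to calculate the jacobian $J$.

To proceed, we define unit vectors

$$ \begin{aligned}
n_0 &= \tfrac{1}{\sqrt 2}\left( e^{y_J},\ e^{-y_J},\ 0,\ 0 \right) , \\
n_3 &= \tfrac{1}{\sqrt 2}\left( e^{y_J},\ -e^{-y_J},\ 0,\ 0 \right) , \\
n_1 &= (0,\ 0,\ \cos\phi_J,\ \sin\phi_J) , \\
n_2 &= (0,\ 0,\ -\sin\phi_J,\ \cos\phi_J) .
\end{aligned} \tag{A9} $$

These are orthogonal to each other and normalized as unit vectors along the coordinate axes in a convenient reference frame: $n_\mu \cdot n_\nu = g_{\mu\nu}$. We thus have

$$ \begin{aligned}
I_J = \frac{1}{16} &\int\! dk_A^2 \int\! dy_A \int\! d\phi_A \int\! dk_B^2 \int\! dy_B \int\! d\phi_B \\
&\times \delta((p_A + p_B - p_J)\cdot n_1)\ \delta((p_A + p_B - p_J)\cdot n_2) \\
&\times \delta((p_A + p_B - p_J)\cdot n_0)\ \delta((p_A + p_B - p_J)\cdot n_3) \times f .
\end{aligned} \tag{A10} $$

Let us examine the effect of

$$ \delta((p_A + p_B - p_J)\cdot n_2) = \delta((\boldsymbol{k}_A + \boldsymbol{k}_B)\cdot \boldsymbol{n}_2) = \delta(k_A \sin(\phi_A - \phi_J) + k_B \sin(\phi_B - \phi_J)) . \tag{A11} $$

Here we use boldface symbols to represent transverse vectors. We can use the delta function to perform the integration over $\phi_B$:

$$ \int\! d\phi_A \int\! d\phi_B\ \delta((p_A + p_B - p_J)\cdot n_2) \cdots = \int\! d\phi_A\ \frac{1}{k_B\, |\cos(\phi_B - \phi_J)|} \cdots . \tag{A12} $$

Then

$$ \sin(\phi_B - \phi_J) = -\frac{k_A}{k_B} \sin(\phi_A - \phi_J) . \tag{A13} $$

We want to change the integration variable to $\Delta\phi = \phi_A - \phi_B$. From Eq. (A13) we have

$$ k_A \sin(\phi_A - \phi_J) = -k_B \left[ \cos(\phi_A - \phi_B) \sin(\phi_A - \phi_J) - \sin(\phi_A - \phi_B) \cos(\phi_A - \phi_J) \right] . \tag{A14} $$

That is

$$ \tan(\phi_A - \phi_J) = \frac{k_B \sin\Delta\phi}{k_A + k_B \cos\Delta\phi} . \tag{A15} $$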
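Thus

$$ d(\phi_A - \phi_J) = k_B \cos^2(\phi_A - \phi_J)\ \frac{k_B + k_A \cos\Delta\phi}{(k_A + k_B \cos\Delta\phi)^2}\ d\Delta\phi . \tag{A16} $$

We also derive

$$ \cos^2(\phi_A - \phi_J) = \frac{(k_A + k_B \cos\Delta\phi)^2}{k_A^2 + k_B^2 + 2 k_A k_B \cos\Delta\phi} , \tag{A17} $$

so

$$ d(\phi_A - \phi_J) = k_B\ \frac{k_B + k_A \cos\Delta\phi}{k_A^2 + k_B^2 + 2 k_A k_B \cos\Delta\phi}\ d\Delta\phi . \tag{A18} $$

Since also

$$ \cos^2(\phi_B - \phi_J) = \frac{(k_B + k_A \cos\Delta\phi)^2}{k_A^2 + k_B^2 + 2 k_A k_B \cos\Delta\phi} , \tag{A19} $$

we have

$$ d(\phi_A - \phi_J) = \frac{k_B \cos(\phi_B - \phi_J)}{[k_A^2 + k_B^2 + 2 k_A k_B \cos\Delta\phi]^{1/2}}\ d\Delta\phi . \tag{A20} $$

Additionally, we note that

$$ [k_A^2 + k_B^2 + 2 k_A k_B \cos\Delta\phi]^{1/2} = k_J . \tag{A21} $$

Thus

$$ \int\! d\phi_A \int\! d\phi_B\ \delta((p_A + p_B - p_J)\cdot n_2) \cdots = \int\! d\Delta\phi\ \frac{1}{k_J} \cdots . \tag{A22} $$

With this result, we have

$$ \begin{aligned}
I_J = \frac{1}{16} &\int\! dy_A \int\! dy_B \int\! dk_A^2 \int\! dk_B^2 \int\! d\Delta\phi\ \frac{1}{k_J} \\
&\times \delta((p_A + p_B - p_J)\cdot n_1)\ \delta((p_A + p_B - p_J)\cdot n_0)\ \delta((p_A + p_B - p_J)\cdot n_3) \times f .
\end{aligned} \tag{A23} $$

We next turn to the elimination of the delta function with $n_3$. We note that

$$ \delta((p_A + p_B - p_J)\cdot n_3) = \delta(k_A \sinh(y_A - y_J) + k_B \sinh(y_B - y_J)) . \tag{A24} $$

We can use this delta function to eliminate the integration over $y_B$:

$$ \int\! dy_A \int\! dy_B\ \delta((p_A + p_B - p_J)\cdot n_3) \cdots = \int\! dy_A\ \frac{1}{k_B \cosh(y_B - y_J)} \cdots . \tag{A25} $$

We want to change the integration variable to $\Delta y = y_A - y_B$. We have

$$ k_A \sinh(y_A - y_J) = -k_B \sinh(y_B - y_J) . \tag{A26} $$

Thus

$$ k_A \sinh(y_A - y_J) = -k_B \left[ \cosh(y_A - y_B) \sinh(y_A - y_J) - \sinh(y_A - y_B) \cosh(y_A - y_J) \right] . \tag{A27} $$

That is

$$ \tanh(y_A - y_J) = \frac{k_B \sinh\Delta y}{k_A + k_B \cosh\Delta y} . \tag{A28} $$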
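Thus

$$ d(y_A - y_J) = k_B \cosh^2(y_A - y_J)\ \frac{k_B + k_A \cosh\Delta y}{(k_A + k_B \cosh\Delta y)^2}\ d\Delta y . \tag{A29} $$

We also derive

$$ \cosh^2(y_A - y_J) = \frac{(k_A + k_B \cosh\Delta y)^2}{k_A^2 + k_B^2 + 2 k_A k_B \cosh\Delta y} , \tag{A30} $$

so

$$ d(y_A - y_J) = k_B\ \frac{k_B + k_A \cosh\Delta y}{k_A^2 + k_B^2 + 2 k_A k_B \cosh\Delta y}\ d\Delta y . \tag{A31} $$

Since also

$$ \cosh^2(y_B - y_J) = \frac{(k_B + k_A \cosh\Delta y)^2}{k_A^2 + k_B^2 + 2 k_A k_B \cosh\Delta y} , \tag{A32} $$

we have

$$ d(y_A - y_J) = \frac{k_B \cosh(y_B - y_J)}{[k_A^2 + k_B^2 + 2 k_A k_B \cosh\Delta y]^{1/2}}\ d\Delta y . \tag{A33} $$

We also note that

$$ \begin{aligned}
k_A^2 + k_B^2 + 2 k_A k_B \cosh\Delta y &= k_A^2 + k_B^2 + 2 k_A k_B \cos\Delta\phi + 2 k_A k_B (\cosh\Delta y - \cos\Delta\phi) \\
&= k_J^2 + 2\, p_A \cdot p_B \\
&= k_J^2 + \mu_J^2 .
\end{aligned} \tag{A34} $$

Thus

$$ \int\! dy_A \int\! dy_B\ \delta((p_A + p_B - p_J)\cdot n_3) \cdots = \int\! d\Delta y\ \frac{1}{\sqrt{k_J^2 + \mu_J^2}} \cdots . \tag{A35} $$

With this result, we have

$$ \begin{aligned}
I_J = \frac{1}{16} &\int\! dk_A^2 \int\! dk_B^2 \int\! d\Delta\phi \int\! d\Delta y\ \frac{1}{k_J \sqrt{k_J^2 + \mu_J^2}} \\
&\times \delta((p_A + p_B - p_J)\cdot n_1)\ \delta((p_A + p_B - p_J)\cdot n_0) \times f .
\end{aligned} \tag{A36} $$

Now we would like to use the remaining delta functions to eliminate the integrations over $k_A^2$ and $k_B^2$.

For the delta function involving $n_0$, we have

$$ \delta((p_A + p_B - p_J)\cdot n_0) = \delta(k_A \cosh(y_A - y_J) + k_B \cosh(y_B - y_J) - a_J) , \tag{A37} $$

where we abbreviate

$$ a_J = \sqrt{k_J^2 + \mu_J^2} . \tag{A38} $$

Using our results expressing $\cosh(y_A - y_J)$ and $\cosh(y_B - y_J)$ in terms of $\Delta y$, this is

$$ \delta((p_A + p_B - p_J)\cdot n_0) = \delta\!\left( \frac{k_A (k_A + k_B \cosh\Delta y) + k_B (k_B + k_A \cosh\Delta y)}{[k_A^2 + k_B^2 + 2 k_A k_B \cosh\Delta y]^{1/2}} - a_J \right) . \tag{A39} $$

That is

$$ \delta((p_A + p_B - p_J)\cdot n_0) = \delta\!\left( [k_A^2 + k_B^2 + 2 k_A k_B \cosh\Delta y]^{1/2} - a_J \right) . \tag{A40} $$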
We can write

$$ [k_A^2 + k_B^2 + 2 k_A k_B \cosh\Delta y]^{1/2} - a_J = \frac{k_A^2 + k_B^2 + 2 k_A k_B \cosh\Delta y - a_J^2}{[k_A^2 + k_B^2 + 2 k_A k_B \cosh\Delta y]^{1/2} + a_J} . \tag{A41} $$

The denominator is not singular, so we can factor it out and evaluate it at the point at which the numerator vanishes:

$$ \delta((p_A + p_B - p_J)\cdot n_0) = 2 a_J\ \delta(k_A^2 + k_B^2 + 2 k_A k_B \cosh\Delta y - a_J^2) . \tag{A42} $$

We will use this result below at Eq. (A52).

For the delta function involving $n_1$, we have

$$ \delta((p_A + p_B - p_J)\cdot n_1) = \delta(k_A \cos(\phi_A - \phi_J) + k_B \cos(\phi_B - \phi_J) - k_J) . \tag{A43} $$

Using our results expressing $\cos(\phi_A - \phi_J)$ and $\cos(\phi_B - \phi_J)$ in terms of $\Delta\phi$, this is

$$ \delta((p_A + p_B - p_J)\cdot n_1) = \delta\!\left( \frac{k_A (k_A + k_B \cos\Delta\phi) + k_B (k_B + k_A \cos\Delta\phi)}{[k_A^2 + k_B^2 + 2 k_A k_B \cos\Delta\phi]^{1/2}} - k_J \right) . \tag{A44} $$

That is

$$ \delta((p_A + p_B - p_J)\cdot n_1) = \delta\!\left( [k_A^2 + k_B^2 + 2 k_A k_B \cos\Delta\phi]^{1/2} - k_J \right) . \tag{A45} $$

We can write

$$ [k_A^2 + k_B^2 + 2 k_A k_B \cos\Delta\phi]^{1/2} - k_J = \frac{k_A^2 + k_B^2 + 2 k_A k_B \cos\Delta\phi - k_J^2}{[k_A^2 + k_B^2 + 2 k_A k_B \cos\Delta\phi]^{1/2} + k_J} . \tag{A46} $$

The denominator is not singular, so we can factor it out and evaluate it at the point at which the numerator vanishes:

$$ \delta((p_A + p_B - p_J)\cdot n_1) = 2 k_J\ \delta(k_A^2 + k_B^2 + 2 k_A k_B \cos\Delta\phi - k_J^2) . \tag{A47} $$

It will prove convenient to write this as

$$ \delta((p_A + p_B - p_J)\cdot n_1) = 2 k_J\ \delta\!\left( (k_A + k_B)^2 - 2 k_A k_B (1 - \cos\Delta\phi) - k_J^2 \right) . \tag{A48} $$

We will use this result below at Eq. (A50).

Now let us change integration variables to $(k_A + k_B)^2$ and $2 k_A k_B$, with

$$ dk_A^2\, dk_B^2 = \frac{k_A k_B}{|k_A^2 - k_B^2|}\ d(k_A + k_B)^2\ d(2 k_A k_B) . \tag{A49} $$

When we make this change of variables, we ought to introduce also a sum over the discrete variable that distinguishes between $k_A$ and $k_B$, since $(k_A + k_B)$ and $(2 k_A k_B)$ are invariant under interchange of $k_A$ and $k_B$. However, we omit a special notation for this because we will soon change back to a variable $z$ that does distinguish between $k_A$ and $k_B$.

We can eliminate the integration over $(k_A + k_B)^2$ at fixed $2 k_A k_B$ using the $n_1$ delta function from Eq. (A48):

$$ dk_A^2\, dk_B^2\ \delta((p_A + p_B - p_J)\cdot n_1) = d(2 k_A k_B)\ \frac{2 k_J\, k_A k_B}{|k_A^2 - k_B^2|} . \tag{A50} $$
Here

$$ (k_A + k_B)^2 = k_J^2 + 2 k_A k_B (1 - \cos\Delta\phi) . \tag{A51} $$

This gives

$$ I_J = \frac{1}{4} \int\! d\Delta\phi \int\! d\Delta y \int\! dt\ \frac{k_A k_B}{|k_A^2 - k_B^2|}\ \delta(A(t)) \times f , \tag{A52} $$

where we have defined

$$ t = 2 k_A k_B \tag{A53} $$

and where $A$ is the argument of the delta function in Eq. (A42),

$$ A = k_A^2 + k_B^2 + 2 k_A k_B \cosh\Delta y - k_J^2 - \mu_J^2 . \tag{A54} $$

From Eq. (A51), we have

$$ k_A^2 + k_B^2 = k_J^2 - t \cos\Delta\phi . \tag{A55} $$

Thus

$$ A(t) = t\, (\cosh\Delta y - \cos\Delta\phi) - \mu_J^2 , \tag{A56} $$

so that

$$ I_J = \frac{1}{4} \int\! d\Delta\phi \int\! d\Delta y\ \frac{k_A k_B}{|k_A^2 - k_B^2|}\ \frac{1}{\cosh\Delta y - \cos\Delta\phi} \times f , \tag{A57} $$

where

$$ 2 k_A k_B (\cosh\Delta y - \cos\Delta\phi) = \mu_J^2 . \tag{A58} $$

This nearly completes the task set at the beginning of this appendix. Now, let us change to some more useful integration variables.
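The kinematic relations used so far are easy to confirm numerically. The Python sketch below builds two massless jet momenta from $(k, y, \phi)$, adds them, and compares the transverse momentum and mass of the sum with $k_J^2 = k_A^2 + k_B^2 + 2k_Ak_B\cos\Delta\phi$ and $\mu_J^2 = 2k_Ak_B(\cosh\Delta y - \cos\Delta\phi)$. It also evaluates $z = k_A/(k_A+k_B)$ and the angle $\varphi$ with $\tan\varphi = \sinh(\Delta y/2)\cos(\Delta\phi/2)/[\cosh(\Delta y/2)\sin(\Delta\phi/2)]$. The function names, the sample numbers, and the use of `atan2` to pick a branch of $\varphi$ are choices made for this illustration.

```python
import math

def massless_momentum(k, y, phi):
    """Return (E, px, py, pz) for a massless jet with transverse momentum k."""
    return (k * math.cosh(y), k * math.cos(phi), k * math.sin(phi), k * math.sinh(y))

def splitting_variables(k_a, y_a, phi_a, k_b, y_b, phi_b):
    """Combine two massless jets and return k_J^2, mu_J^2, z and varphi."""
    e, px, py, pz = (a + b for a, b in zip(massless_momentum(k_a, y_a, phi_a),
                                           massless_momentum(k_b, y_b, phi_b)))
    d_y, d_phi = y_a - y_b, phi_a - phi_b
    return {
        "k_j2": px * px + py * py,
        "mu_j2": e * e - px * px - py * py - pz * pz,
        "z": k_a / (k_a + k_b),
        # atan2 selects one branch of the angle defined through tan(varphi).
        "varphi": math.atan2(math.sinh(d_y / 2) * math.cos(d_phi / 2),
                             math.cosh(d_y / 2) * math.sin(d_phi / 2)),
    }

k_a, y_a, phi_a = 40.0, 0.3, 0.1
k_b, y_b, phi_b = 25.0, -0.1, -0.2
v = splitting_variables(k_a, y_a, phi_a, k_b, y_b, phi_b)
d_y, d_phi = y_a - y_b, phi_a - phi_b
print(v["k_j2"], k_a**2 + k_b**2 + 2 * k_a * k_b * math.cos(d_phi))
print(v["mu_j2"], 2 * k_a * k_b * (math.cosh(d_y) - math.cos(d_phi)))
print(v["z"], v["varphi"])
```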

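Let us define a momentum fraction $z$ according to Eq. (A4). We need to express $z(1-z)$ as a function of $\Delta y$ and $\Delta\phi$. Using Eqs. (A58) and (A51), we have

$$ \frac{k_A k_B}{(k_A + k_B)^2} = \frac{\mu_J^2}{2 (\cosh\Delta y - \cos\Delta\phi)}\ \frac{\cosh\Delta y - \cos\Delta\phi}{k_J^2 (\cosh\Delta y - \cos\Delta\phi) + \mu_J^2 (1 - \cos\Delta\phi)} . \tag{A59} $$

Thus

$$ z(1-z) = \frac{\mu_J^2}{2 \left[ k_J^2 (\cosh\Delta y - \cos\Delta\phi) + \mu_J^2 (1 - \cos\Delta\phi) \right]} . \tag{A60} $$

From this, we calculate

$$ \begin{aligned}
\frac{\partial\, z(1-z)}{\partial \Delta\phi} &= -\frac{2 z^2 (1-z)^2 k_J^2}{\mu_J^2}\ (1 + R) \sin\Delta\phi , \\
\frac{\partial\, z(1-z)}{\partial \Delta y} &= -\frac{2 z^2 (1-z)^2 k_J^2}{\mu_J^2}\ \sinh\Delta y ,
\end{aligned} \tag{A61} $$

where

$$ R = \frac{\mu_J^2}{k_J^2} . \tag{A62} $$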
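We need another variable, $\varphi$, which we define according to Eq. (A7). The gradient of $\tan\varphi$ is

$$ \begin{aligned}
\frac{\partial \tan\varphi}{\partial \Delta\phi} &= -\frac{\tan\varphi}{\sin\Delta\phi} , \\
\frac{\partial \tan\varphi}{\partial \Delta y} &= \frac{\tan\varphi}{\sinh\Delta y} .
\end{aligned} \tag{A63} $$

We can use the partial derivatives to calculate the jacobian, giving

$$ d\Delta\phi\ d\Delta y = \frac{\mu_J^2}{2 z^2 (1-z)^2 k_J^2}\ \frac{\sinh\Delta y\, \sin\Delta\phi}{\sinh^2\Delta y + (1 + R) \sin^2\Delta\phi}\ d(z(1-z))\ \frac{d\tan\varphi}{\tan\varphi} . \tag{A64} $$

That is,

$$ d\Delta\phi\ d\Delta y = \frac{\mu_J^2\, |1 - 2z|}{2 z^2 (1-z)^2 k_J^2}\ \frac{\sinh\Delta y\, \sin\Delta\phi}{\sinh^2\Delta y + (1 + R) \sin^2\Delta\phi}\ \frac{1}{\sin\varphi \cos\varphi}\ dz\ d\varphi . \tag{A65} $$

With a little algebra, we find

$$ \sin\varphi \cos\varphi = \frac{1}{2}\ \frac{\sinh\Delta y\, \sin\Delta\phi}{\cosh\Delta y - \cos\Delta\phi} . \tag{A66} $$

Thus

$$ d\Delta\phi\ d\Delta y = \frac{\mu_J^2\, |1 - 2z|}{z^2 (1-z)^2 k_J^2}\ \frac{\cosh\Delta y - \cos\Delta\phi}{\sinh^2\Delta y + (1 + R) \sin^2\Delta\phi}\ dz\ d\varphi . \tag{A67} $$

Now we insert this result into Eq. (A57). There is a factor

$$ \frac{k_A k_B}{|k_A^2 - k_B^2|} = \frac{z(1-z)}{|1 - 2z|} , \tag{A68} $$

which cancels the $|1 - 2z|$ in the numerator of Eq. (A65). Then

$$ I_J = \frac{1}{4} \int\! dz \int\! d\varphi\ \frac{\mu_J^2}{z(1-z)\, k_J^2}\ \frac{1}{\sinh^2\Delta y + (1 + R) \sin^2\Delta\phi} \times f . \tag{A69} $$

We can use Eq. (A60) to express $\mu_J^2$ in terms of $z(1-z)$ and the angles $(\Delta y, \Delta\phi)$:

$$ I_J = \frac{1}{2} \int\! dz \int\! d\varphi\ \frac{k_J^2 (\cosh\Delta y - \cos\Delta\phi) + \mu_J^2 (1 - \cos\Delta\phi)}{k_J^2 \left[ \sinh^2\Delta y + (1 + R) \sin^2\Delta\phi \right]} \times f . \tag{A70} $$

We can rewrite this as

$$ I_J = \frac{1}{4} \int\! dz \int\! d\varphi\ \frac{\sinh^2(\Delta y/2) + (1 + R) \sin^2(\Delta\phi/2)}{\sinh^2(\Delta y/2) \cosh^2(\Delta y/2) + (1 + R) \sin^2(\Delta\phi/2) \cos^2(\Delta\phi/2)} \times f . \tag{A71} $$

This is the result that we sought. We note that since $\cosh(\Delta y/2) \approx 1$ and $\cos(\Delta\phi/2) \approx 1$ for small angles $(\Delta y, \Delta\phi)$, we have approximately

$$ I_J \approx \frac{1}{4} \int\! dz \int\! d\varphi\ f \tag{A72} $$

when the integration is dominated by the small angle region.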
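The jacobian can be evaluated numerically to see how quickly it approaches its small angle limit. The Python sketch below assumes the form $J = \tfrac14\,[\sinh^2(\Delta y/2) + (1+R)\sin^2(\Delta\phi/2)]/[\sinh^2(\Delta y/2)\cosh^2(\Delta y/2) + (1+R)\sin^2(\Delta\phi/2)\cos^2(\Delta\phi/2)]$ with $R = \mu_J^2/k_J^2$, including the overall prefactor 1/4. The function name and sample arguments are illustrative, and the expression is undefined at $\Delta y = \Delta\phi = 0$.

```python
import math

def jacobian(delta_y, delta_phi, r):
    """J in I_J = int dz dvarphi J f, for R = mu_J^2 / k_J^2 (assumed prefactor 1/4)."""
    sinh2 = math.sinh(delta_y / 2.0) ** 2
    cosh2 = math.cosh(delta_y / 2.0) ** 2
    sin2 = math.sin(delta_phi / 2.0) ** 2
    cos2 = math.cos(delta_phi / 2.0) ** 2
    numerator = sinh2 + (1.0 + r) * sin2
    denominator = sinh2 * cosh2 + (1.0 + r) * sin2 * cos2
    return 0.25 * numerator / denominator

print(jacobian(1.0e-3, 1.0e-3, 0.1))  # small angles: very close to 1/4
print(jacobian(0.4, 0.3, 0.1))        # microjet-scale separation
print(jacobian(2.0, 0.0, 0.1))        # wide separation in rapidity only
```

For a pure rapidity separation the expression reduces to $1/[4\cosh^2(\Delta y/2)]$, so the small angle value 1/4 is approached from below as the separation shrinks.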